Boundary Conditions for a Series of Machine Learning Models
by Matthew Smith R Shenanigans, 4y ago
Machine Learning at the Boundary: It is nothing new that many machine learning models can outperform traditional econometric models, but as part of my research I want to show why and how some models make given predictions or, in this instance, classifications. I wanted to show the decision boundary that my binary classification model was producing. That is, I wanted to show the partition of the space that splits my observations into each class. The problem and code can be extended to a multi-class classification problem with some tweaks. Initialisation: I first load in a series of packages ..read more
Machine Learning (XGBoost) Time-Series Classification Trading Strategy
by Matthew Smith R Shenanigans, 4y ago
Introduction: Using Machine Learning (ML) and past price data to predict the next period's price or direction in the stock market is not new, nor does it produce any meaningful predictions. In this post I collapse a series of asset time series into a simple classification problem and see if a Machine Learning model can do a better job at predicting the next period's direction. I apply a method similar to the one in Time Series Classification Synthetic vs Real Financial Time Series. The objective and strategy is to invest in a single asset each day. The asset we invest in will be the asset w ..read more
Quantitative Analytics: Optimal Portfolio Allocation
by Matthew Smith R Shenanigans, 4y ago
Introduction: The literature on portfolio optimisation has been around for decades. In this post I cover a number of traditional portfolio optimisation models. The general aim is to select a portfolio of assets out of the set of all possible portfolios being considered, according to a defined objective function. The data: The data is collected using the tidyquant package’s tq_get() function. I then convert the daily asset prices to daily log returns using the periodReturn() function from the quantmod package. Next I construct lists of 6 months' worth of daily returns using the rolling_origin() functio ..read more
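The conversion from daily prices to daily log returns mentioned above can be sketched in base R. This is only an illustration of the arithmetic, not the post's code: the post uses periodReturn() from quantmod, and the `prices` vector below is a hypothetical example.

```r
# Hypothetical daily closing prices (illustrative values, not from the post)
prices <- c(100, 102, 101, 105)

# Log return for day t: log(P_t / P_{t-1}) = log(P_t) - log(P_{t-1})
log_returns <- diff(log(prices))

# Log returns are additive over time: their sum is the log return
# over the whole period, log(P_last / P_first)
total_log_return <- sum(log_returns)
```

The additivity in the last step is one common reason for working with log returns rather than simple returns when aggregating over rolling windows.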
Quantitative Trading Strategies using quantmod
by Matthew Smith R Shenanigans, 4y ago
Strategies Introduction: This post goes through a number of the technical trading functions in the TTR package. I add definitions and show examples of how the functions work. library(dplyr) library(quantmod) library(tidyquant) library(TTR) library(timetk) library(tidyr) library(ggplot2) library(directlabels) library(data.table) library(quantstrat) library(purrr) We can download some data using the quantmod package, which stores the different assets in our Global Environment. start_date <- "2013-01-01" end_date <- "2017-08-31" symbols <- c("AAPL", "AMD", "A ..read more
Factor Modeling in R
by Matthew Smith R Shenanigans, 4y ago
The most popular models for analysing the risk of portfolios are factor models, since stocks have a tendency to move together. The principal component of securities often explains a large share of their variance. Since we are mostly concerned with multiple assets which form a portfolio, we need to account for this. One question might be: why do stocks with low Price to Book ratios outperform stocks with higher Price to Book ratios? Here the “Price” part of the ratio is simply the share price (per share) and the “Book” part of the ratio is the “Shareholders' Equity” / “Shares Outstanding” which a ..read more
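As a small worked example of the Price to Book ratio defined above, the arithmetic can be written out in R. All figures are hypothetical and chosen only to make the calculation easy to follow.

```r
# Hypothetical company figures (illustrative only, not from the post)
share_price         <- 50    # "Price": price per share
shareholders_equity <- 1e9   # total shareholders' equity
shares_outstanding  <- 4e7   # number of shares outstanding

# "Book": shareholders' equity per share
book_value_per_share <- shareholders_equity / shares_outstanding

# Price to Book ratio: price per share divided by book value per share
price_to_book <- share_price / book_value_per_share
```

With these numbers the book value per share is 25, so the ratio is 2: the market values the company at twice its book equity.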
Time Series Classification Synthetic vs Real Financial Time Series
by Matthew Smith R Shenanigans, 4y ago
::Note:: This is a long post in which I describe the procedure I followed for a specific time series classification task. I was given a “Data Science” challenge as part of an interview in which I had to distinguish between real financial time series and synthetic time series. I document the results here. The data was anonymous and I have no idea which assets were which or which time series the assets came from. To conclude, I obtained an in-sample test accuracy of 67% and an out-of-sample test accuracy of 65% (based on what the interviewers told me). All I knew was that I had 12,000 r ..read more
Learning the Quantstrat and Blotter packages
by Matthew Smith R Shenanigans, 4y ago
Load packages: library(knitr) library(kableExtra) library(dplyr) library(ggplot2) library(quantstrat) ::: Note ::: This post is mostly for my future reference/documentation for learning the quantstrat package. An example of a strategy I developed can be found below; it uses a naive rolling logistic regression model trained on t days to predict the t+1 market movement. ::: END Note ::: I have been playing around with backtesting trading models using the quantstrat package for a while, but the most difficult thing about it was understanding the syntax of blotter and quantstrat; it didn’t seem ..read more
Need for Funds (NFO) and Working Capital (WC) Calculations
by Matthew Smith R Shenanigans, 4y ago
Corporate Finance and Managerial Finance: I created a script to download data for any company from Yahoo Finance, compute the DuPont, Need for Funds and Working Capital measures from each company's fundamentals, and plot these for each company. I suggest you first watch these two short videos from IESE Business School - part 1 here and part 2 here. First, load some packages: library(knitr) library(kableExtra) library(tidyquant) library(quantmod) library(tibble) library(tidyr) library(reshape2) library(ggplot2) library(rvest) I first define the function to scrape Yahoo Finance and collect the Balance Sheet ..read more
Series of Hackerrank Competitions
by Matthew Smith R Shenanigans, 4y ago
library(knitr) library(kableExtra) library(ggplot2) library(tidyquant) library(dplyr) Problem 1: Battery Life Prediction Problem. The first problem was to predict how long a laptop would last given the number of hours it was charged. The problem is here. We can read in the data directly from the HackerRank website using the following: BatteryData <- read.table(file = url("https://s3.amazonaws.com/hr-testcases/399/assets/trainingdata.txt"), strip.white = TRUE, sep = ',', header = FALSE, skip = 0)
Table 1: Battery Life Data
V1    V2
2.81  5.62
7.14  8.00
2.72  5.44
3.87  7.74
..read more
Kalman Smoothing for Time Series Missing Value Imputation
by Matthew Smith R Shenanigans, 4y ago
I was recently given a task to impute some missing time series values for a prediction problem. Python has the TSFRESH package, which is pretty well documented, but I wanted to apply something using R. I opted for a model from statistics and control theory called Kalman Smoothing, which is available in the imputeTS package in R. I went with smoothing over filtering: the Kalman filter takes a forward pass through the data and uses only the data up to the current time point, so it can be run in real time, whereas Kalman smoothing adds a backward pass through the data and thus uses all the data. I guess i ..read more
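The difference between the forward (filter) pass and the added backward (smoother) pass can be illustrated with a minimal hand-written sketch in base R. This is not the imputeTS implementation: it assumes a simple local level model (a random walk observed with noise), and the function name `kalman_local_level`, the variances `q` and `r`, and the example series are all hypothetical choices made for illustration.

```r
# Local level model (an assumption for this sketch):
#   state:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
#   observation: y_t = x_t + v_t,      v_t ~ N(0, r)
kalman_local_level <- function(y, q = 1, r = 1, a0 = 0, p0 = 1e6) {
  n <- length(y)
  a_pred <- p_pred <- a_filt <- p_filt <- numeric(n)
  a_prev <- a0
  p_prev <- p0  # large initial variance: a vague prior on the first state

  # Forward pass (filter): uses only data up to time t
  for (t in seq_len(n)) {
    a_pred[t] <- a_prev
    p_pred[t] <- p_prev + q
    if (is.na(y[t])) {
      # Missing observation: no update, carry the prediction forward
      a_filt[t] <- a_pred[t]
      p_filt[t] <- p_pred[t]
    } else {
      k <- p_pred[t] / (p_pred[t] + r)  # Kalman gain
      a_filt[t] <- a_pred[t] + k * (y[t] - a_pred[t])
      p_filt[t] <- (1 - k) * p_pred[t]
    }
    a_prev <- a_filt[t]
    p_prev <- p_filt[t]
  }

  # Backward pass (Rauch-Tung-Striebel smoother): uses all the data
  a_smooth <- a_filt
  if (n > 1) {
    for (t in (n - 1):1) {
      c_t <- p_filt[t] / p_pred[t + 1]
      a_smooth[t] <- a_filt[t] + c_t * (a_smooth[t + 1] - a_pred[t + 1])
    }
  }
  list(filtered = a_filt, smoothed = a_smooth)
}

# Hypothetical series with one missing value
y <- c(1, 2, NA, 4, 5)
res <- kalman_local_level(y)

# Impute missing points with the smoothed state estimate
y_imputed <- ifelse(is.na(y), res$smoothed, y)
```

At the missing point the filter can only carry forward what it knew at the previous step, while the smoother also draws on the later observations, which is why smoothing is the natural choice when the whole series is already available.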
