Day Performance Similarities of Multiple Listed Companies - A Case Study

Posted on Jul 26, 2017


Idea

Some public companies are listed on more than one stock exchange. They are called “dual-listed” or “cross-listed” depending on the legal implementation (this distinction is not taken into account in the following).
If the stock exchanges are in different time zones, the stock of one company is traded twice within 24 hours. We want to investigate whether the day performance on the previous stock exchange carries a predictive signal for the following one. To assess the presence and usefulness of such a signal, a simple trading model is defined and applied to past data.

Stock Exchange Selection

The trading times of the stock exchanges should not overlap, so that a completed trading day can be used as input. New York lies in the GMT-5 time zone, so suitable stock exchanges can be found in Asia. I selected the Bombay Stock Exchange (BSE) in India, which lies in the GMT+5:30 time zone; the choice among the candidates was otherwise arbitrary.
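The non-overlap requirement can be checked with a few lines of arithmetic. The sketch below is only an illustration: the regular session times (9:30 to 16:00 in New York, 9:15 to 15:30 in Mumbai) are my assumption and are not stated in this article, and the fixed GMT-5 offset ignores daylight saving time.

```python
# Assumed regular trading hours in local time (not stated in the article).
# Offsets are fixed: New York at GMT-5 (standard time), Mumbai at GMT+5:30.
SESSIONS = {
    "NYSE/NASDAQ": {"utc_offset_h": -5.0, "open": (9, 30), "close": (16, 0)},
    "BSE": {"utc_offset_h": 5.5, "open": (9, 15), "close": (15, 30)},
}


def to_utc_minutes(hour, minute, utc_offset_h):
    """Convert a local wall-clock time to minutes after midnight UTC."""
    local = hour * 60 + minute
    return int(local - utc_offset_h * 60) % (24 * 60)


def session_utc(name):
    """Return (open, close) of a session in minutes after midnight UTC."""
    s = SESSIONS[name]
    return (to_utc_minutes(*s["open"], s["utc_offset_h"]),
            to_utc_minutes(*s["close"], s["utc_offset_h"]))


def overlaps(a, b):
    """True if two UTC intervals (start, end) with start < end intersect."""
    return a[0] < b[1] and b[0] < a[1]


us = session_utc("NYSE/NASDAQ")   # 14:30 to 21:00 UTC
india = session_utc("BSE")        # 03:45 to 10:00 UTC
sessions_overlap = overlaps(us, india)
```

Under these assumptions the two sessions are several hours apart. The simple interval test only works because neither session crosses midnight UTC.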

Companies Selection

An online search returned the following site, which lists companies that are listed on NASDAQ/NYSE and BSE: http://www.goodreturns.in/classroom/2015/06/indian-companies-listed-on-nasdaq-or-nyse-369055.html
As a next step, I looked up the corresponding stock symbols on NYSE/NASDAQ and BSE. I could not find or match all of them on both stock exchanges, so the following companies remained:

| Company | US Stock Exch. | IN Stock Exch. | Sector | US Symbol | IN Symbol | Comment |
| --- | --- | --- | --- | --- | --- | --- |
| ICICI Bank | NYSE | BSE | Banks | IBN | ICICIBANK | |
| Infosys | NYSE | BSE | Software & Computer Services | INFY | INFY | |
| Vedanta Limited | NYSE | BSE | Industrial Metals & Mining | VEDL | VEDL | |
| Tata Motors | NYSE | BSE | Industrial Engineering | TTM | TATAMOTORS | |
| Videocon d2h | NASDAQ | BSE | TV Services | VDTH | VIDEOIND | VIDEOIND is the group. |
| Wipro | NYSE | BSE | Software & Computer Services | WIT | WIPRO | |

Videocon d2h (VDTH) is a subsidiary of the group Videocon Industries Limited (VIDEOIND) and therefore does not meet the precondition of a multiple-listed company, but I also obtained its data to look at such a case. Additionally, data for VDTH are not available before 2015, when VDTH was established.

Data Gathering and Preprocessing

Daily historical stock data for the last ten years can be downloaded as CSV files from the websites of the stock exchanges.

There are two cases to consider: we can trade in New York and use the day performances at the BSE as indicators, or vice versa. The goal was to produce one table in which each row is an observation:

| flip | us_symbol | in_symbol | us_date | in_date | us_open | us_close | in_open | in_close | us_perf | in_perf |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| in->us | WIT | WIPRO | 13.07.2007 | 13.07.2007 | 4,2865 | 4,2974 | 519 | 512,6 | 0,00254 | -0,01233 |
| in->us | WIT | WIPRO | 16.07.2007 | 16.07.2007 | 4,2513 | 4,2215 | 514,9 | 500,7 | -0,00701 | -0,02758 |
| in->us | WIT | WIPRO | 17.07.2007 | 17.07.2007 | 4,2052 | 4,2459 | 500 | 505,6 | 0,00968 | 0,01120 |

The column flip describes the trading scenario: “in->us” stands for trading in the US with the data from India as the indicator, and “us->in” marks the opposite case. The table contains only observations for which the counterpart stock exchange provided data in the previous 24 hours. Dates with a longer time gap, due to weekends or public holidays, are omitted. For the “in->us” case, this means that the Indian and the US date are the same. For the “us->in” case, it means that the Indian date is one day after the US date. The day performance is calculated with the formula: (close_price - open_price) / open_price.
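The pairing rules and the day performance formula can be sketched as follows. This is not the code used for the article; the function names and the dictionary layout are hypothetical, and the input uses the three Wipro rows from the table above, written with decimal points.

```python
from datetime import date, timedelta


def day_perf(open_price, close_price):
    """Day performance as defined in the text: (close - open) / open."""
    return (close_price - open_price) / open_price


def build_observations(us_days, in_days):
    """Pair each trading day with the remote day from the previous 24 hours.

    us_days / in_days map a date to an (open, close) tuple.
    """
    rows = []
    for d, (us_open, us_close) in sorted(us_days.items()):
        # "in->us": trade in the US; the Indian session of the same date
        # is the indicator.
        if d in in_days:
            rows.append({"flip": "in->us", "us_date": d, "in_date": d,
                         "us_perf": day_perf(us_open, us_close),
                         "in_perf": day_perf(*in_days[d])})
        # "us->in": trade in India one calendar day after the US session.
        nxt = d + timedelta(days=1)
        if nxt in in_days:
            rows.append({"flip": "us->in", "us_date": d, "in_date": nxt,
                         "us_perf": day_perf(us_open, us_close),
                         "in_perf": day_perf(*in_days[nxt])})
    return rows


us_days = {date(2007, 7, 13): (4.2865, 4.2974),
           date(2007, 7, 16): (4.2513, 4.2215),
           date(2007, 7, 17): (4.2052, 4.2459)}
in_days = {date(2007, 7, 13): (519.0, 512.6),
           date(2007, 7, 16): (514.9, 500.7),
           date(2007, 7, 17): (500.0, 505.6)}

observations = build_observations(us_days, in_days)
```

The US session of Friday, 13.07.2007 produces no “us->in” row, because the next Indian session is more than one day later; this is how weekends and holidays drop out of the dataset.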

Idealistic Trading Model

To evaluate the usefulness of the previous day performance, we define a simplified trading model. If the day performance at the previous stock exchange is greater than zero, we buy the stock at the opening price and sell it at the closing price. Accordingly, the day performance on the local stock exchange is the win or loss of the trade. The trigger for trading is the day performance of the stock at the previous, remote stock exchange. Trading fees are not taken into account.
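A minimal sketch of this trading rule, assuming that the result is the plain sum of the local day performances on the traded days (no compounding, no fees). The function name is hypothetical; the sample values are the “in->us” Wipro rows from the observation table.

```python
def trading_result(rows):
    """Sum the local day performance over the days on which the model trades.

    Each row is a (remote_perf, local_perf) pair. A trade is triggered only
    when the remote day performance is greater than zero.
    """
    traded = [local for remote, local in rows if remote > 0]
    return sum(traded), len(traded)


# (in_perf, us_perf) for the three "in->us" Wipro observations.
rows = [(-0.01233, 0.00254), (-0.02758, -0.00701), (0.01120, 0.00968)]

model_gain, n_trades = trading_result(rows)
every_day_gain = sum(local for _, local in rows)
```

Here only the third day triggers a trade, so the model collects 0.00968, while buying on all three days would sum to 0.00521.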

First Glance

Applying this simple trading model to our data from 2008-01-01 to 2017-01-01 gives the following result:

Case “in->us”, trading at NASDAQ/NYSE:

The trading model generates an impressive gain of 1349.15%. The table above the chart gives some key metrics. The rows describe three different trading scenarios. The “Trading” row reflects our trading model. The “Local” row summarizes the day performances of all dates in the dataset (in the time range) at the local stock exchange (in this case NASDAQ/NYSE). The “Remote” row summarizes the day performances of all dates in the dataset (in the time range) at the remote stock exchange (in this case BSE).
The columns are described below:

Sum Perf. (%): The sum of the day performances in percent.
Count: The number of rows. Multiplied by 2, it gives the number of necessary trades (buying and selling).
Mean Perf. (%): The mean of the day performances in percent.
Median Perf. (%): The median of the day performances in percent.
Std.dev Perf. (%): The standard deviation of the day performances in percent.
Pos. Days (%): The percentage of positive day performances.
Avg. Win pos. Days (%): The average of the positive day performances in percent.
Avg. Loss neg. Days (%): The average of the zero or negative day performances in percent.
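The metrics above could be computed along these lines. This is a sketch with hypothetical names, not the Shiny app's code; in particular, the use of the sample standard deviation is my assumption, since the text does not say which variant is used.

```python
from statistics import mean, median, stdev


def summarize(perfs):
    """Compute the key metrics for a non-empty list of day performances.

    Input values are fractions (0.01 = 1%); outputs are in percent.
    """
    pct = [p * 100 for p in perfs]
    pos = [p for p in pct if p > 0]
    non_pos = [p for p in pct if p <= 0]
    return {
        "sum_perf": sum(pct),
        "count": len(pct),
        "mean_perf": mean(pct),
        "median_perf": median(pct),
        # Sample standard deviation (assumption, see lead-in).
        "std_dev_perf": stdev(pct) if len(pct) > 1 else 0.0,
        "pos_days": 100 * len(pos) / len(pct),
        "avg_win_pos_days": mean(pos) if pos else 0.0,
        "avg_loss_neg_days": mean(non_pos) if non_pos else 0.0,
    }


# Made-up day performances, for illustration only.
stats = summarize([0.02, -0.01, 0.00, 0.03])
```

For the “Trading” row the input would be the local day performances of the traded days only; for “Local” and “Remote” it would be all rows in the time range.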

A comparison of the “Trading” row with the “Local” row shows that the trading model has a higher share of positive dates, i.e. dates with a win. Additionally, the average gain on positive trading dates is greater than the average gain over all positive days, and the average loss on negative trading dates is smaller than the average loss over all negative days. In combination, this effect is strong enough to produce a gain of 1349%.

Case “us->in”, trading at BSE:

The total gain in India is approximately 712%. While not as big as the gain in New York, it is still remarkable.
The performance difference between the trading model and the local row is notable. A trade on each date in the dataset would have resulted in a loss of approximately 849%, whereas the trading model yields a substantial profit.

Overall, the results suggest that the day performance of the previous, remote stock exchange contains a valuable signal that is worth observing.
In total, the performance grows in both the “in->us” and the “us->in” case, but there are some interesting details. For example, the performance of WIT and VDTH is negative in the “us->in” case, and in the “in->us” case WIT seems to have yielded no gain since 2010.
Before we dive deeper into the data, we take a closer look at the correlation of the stock prices of the companies.

Stock Price Correlations

A scatterplot of the US and IN stock prices from 2008-01-01 to 2017-07-01 looks somewhat odd, with several separate patches for the same company:

These separate patches are caused by stock splits performed in India. For example, IBN performed one at the end of 2015.

Within the time range from 2008-01-01 to 2014-09-01, the prices are strongly correlated:

The same holds for the time range afterwards (2015-01-01 to 2017-07-01):

When the stock splits are taken into account and each company is compared separately, without the split events, the Pearson correlations of the stock prices are always greater than 80%. The exception is VDTH, where the correlation factor is only 73%. However, given that the US company VDTH is a subsidiary of the Indian group VIDEOIND, this is still strong.
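To illustrate why the split events have to be handled separately, the following sketch computes the Pearson correlation for the periods before and after a split and for the pooled data. The price series are made up for illustration (a hypothetical 5:1 split on the Indian side) and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical prices: the Indian price follows the US price, but is cut
# to one fifth after the third day by a split.
us_prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
in_prices = [100.0, 110.0, 120.0, 26.0, 28.0, 30.0]

corr_before = pearson(us_prices[:3], in_prices[:3])
corr_after = pearson(us_prices[3:], in_prices[3:])
corr_pooled = pearson(us_prices, in_prices)
```

In this toy example both segments are perfectly correlated, while the pooled correlation is negative, which mirrors the separate patches in the scatterplot.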

Day Performance Correlations

After the stock price correlations, we turn to the day performance correlations, i.e. the relationship between the day performance of the previous stock exchange and the current/local one.

The correlations per company in the time range from 2008-01-01 to 2017-07-01 are as follows:

Trading in US, case “in->us”:

The correlation factors are much weaker than the ones of the stock prices. However, there is a measurable relationship, except for VDTH.
The total correlation over all companies is 15%.

Trading in India, case “us->in”:

These correlation factors are also much weaker than those of the stock prices, and additionally they are weaker than in the “in->us” case. This could explain the lower overall trading performance there. The Pearson correlation of VDTH is even negative, and those of VEDL and WIT are considerably low. Removing VDTH and calculating the overall trading performance from 2008-01-01 to 2017-07-01 yields a gain of 776.36%, 63.38 percentage points more than with VDTH. Removing both VDTH and WIT yields an overall trading performance of 805.42% (92.44 percentage points more).
Because of the low overall day performance correlation of VDTH, it is excluded from the evaluations below.

To investigate the correlations in more detail we introduce two new measures, a sliding correlation and sliding trading performance.

Sliding Correlation and Trading Performance

For each data point (i.e. each date/row in our dataset), the sliding correlation is the correlation between the local and remote day performances over the N rows/dates before it. In the Shiny app, N can be set to a value between 10 and 120. In the following, I use N = 60, which is a good tradeoff between accuracy and noise. The sliding trading performance is calculated the same way: the trading model is applied to the last N rows and the win/loss is calculated.
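The two sliding measures can be sketched as follows. This is my reading of the description, not the app's actual code: I assume the window consists of the N rows strictly before the current row, and that rows with fewer than N predecessors are skipped. All names and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None if one series has zero variance."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def sliding_measures(remote, local, n):
    """For each row i >= n, evaluate the n rows before it (i-n .. i-1).

    Returns a list of (row_index, sliding_correlation, sliding_trading_perf).
    """
    out = []
    for i in range(n, len(local)):
        r_win = remote[i - n:i]
        l_win = local[i - n:i]
        corr = pearson(r_win, l_win)
        # Trading model on the window: trade only if the remote day was up.
        perf = sum(l for r, l in zip(r_win, l_win) if r > 0)
        out.append((i, corr, perf))
    return out


# Made-up day performances; the local series is half the remote one.
remote = [0.01, -0.02, 0.03, -0.01, 0.02]
local = [0.005, -0.01, 0.015, -0.005, 0.01]

result = sliding_measures(remote, local, n=3)
```

With N = 60 as used in the article, the first 60 rows of each company would produce no value under this interpretation.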

Investigating the distribution of the sliding correlation and the sliding trading performance for the time range from 2008-01-01 to 2017-07-01 gives the following boxplots:

Trading in US, case “in->us”:

Trading in India, case “us->in”:

The sliding correlation factors are predominantly positive. The sliding trading performance lies around zero, but still a bit above, which is enough to generate a positive overall trading performance.

The following tables show the correlation between the sliding correlation factors and the sliding trading performance:

Trading in US, case “in->us”:

Trading in India, case “us->in”:

The correlations are distinct and show that the trading performance depends on the day performance correlations. However, the correlation factors vary noticeably between companies. We should also consider that we trade only when the remote day performance is positive. Thus, the sliding correlation can be strong while the trading performance is zero, because all remote day performances in the window are negative. In other words, even with a strong remote/local correlation of the day performances, it is not guaranteed that enough remote day performances are positive or that the triggered trades yield a profit.
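Two made-up windows illustrate this point. In both, the remote and local day performances are perfectly correlated, yet the trading rule earns nothing in the first window and loses money in the second. The values and names are purely illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))


def trading_perf(remote, local):
    """Sum of local day performances on days with a positive remote day."""
    return sum(l for r, l in zip(remote, local) if r > 0)


# Window 1: all remote days are negative, so no trade is triggered.
remote_1 = [-0.01, -0.02, -0.03]
local_1 = [-0.005, -0.010, -0.015]

# Window 2: every remote day is positive, but the local days are all
# negative; they still rise and fall together with the remote ones.
remote_2 = [0.01, 0.02, 0.03]
local_2 = [-0.03, -0.02, -0.01]

corr_1, perf_1 = pearson(remote_1, local_1), trading_perf(remote_1, local_1)
corr_2, perf_2 = pearson(remote_2, local_2), trading_perf(remote_2, local_2)
```

The correlation measures only whether the two series move together, not whether the local performance is positive on the days that trigger a trade.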

In the following, we look at some of the companies from 2015-01-01 to 2017-05-14 for the “in->us” case.

ICICI Bank (IBN)

The overall trading performance in the time range is approximately 12.9%, clearly above the every-day-trading performance of -29.13%.

The trading performance changed noticeably over time, with a clear negative peak in the first quarter of 2016, but recovered later.

The sliding correlation plot explains this observation. Starting in the middle of 2015, the correlation decreased and even turned negative, causing substantial trading losses. During 2016 the correlation recovered and turned clearly positive, driving the trading performance up again, but it seems to decrease again in 2017.

The overall correlation of the local and remote day performances is given in the following table:

The evaluation shows the volatility of the day performance correlation and how it drives the trading performance up and down. Overall, there is a profit of 12.87%, but during 2015 it was strongly negative.

Vedanta Limited (VEDL)

The overall trading performance is impressive; notably, even every-day trading would have yielded a gain of 49.27%.

The trading performance increased constantly and almost linearly, which is close to ideal.

The sliding correlation plot shows some volatility, but the correlation stayed positive over the complete time span.

The overall correlation of the local and remote day performances is high.

VEDL is an example of an optimal case: a strong and constantly positive correlation of the day performances that leads to a linear trading gain.

Infosys (INFY)

The overall trading performance is low and smaller than the every-day-trading performance. That means that our trading model did not generate any advantage, quite the contrary.

The trading performance moved up and down, with a strong positive peak in 2016, but decreased again afterwards.

The sliding correlation plot shows long phases of strongly negative correlation.

The overall correlation of the day performances is only 1% and is therefore practically absent. However, the sliding correlation plot gives the impression that the positive and negative phases are not completely random (see the first half of each year). If it were possible to avoid the negative phases, a positive trading performance might be achievable.

Wipro (WIT)

The overall trading performance is clearly positive at 26.57%, but only 5% above the every-day-trading performance.

The trading performance increased until the middle of 2016 and then moved sideways, generating no additional gain.

The sliding correlation plot shows an increasing correlation from mid-2015 to the first quarter of 2016. The subsequent drop is not reflected in the trading performance, which would be interesting to investigate further. Afterwards the correlation seems to be too weak to drive further growth in the trading performance, and a sideways movement started. In the middle of 2017 the sliding correlation and the trading performance rise again.

The overall day performance correlation is rather small:

Conclusion

The evaluation of the six companies showed that their stock prices are strongly correlated. The day performances on the succeeding stock exchange can also correlate, but at a lower level and in a more volatile way. Additionally, there are distinct differences between the companies and between time spans. The application of a simplified and idealistic trading model showed that the day performance correlations might have the potential to generate remarkable profit.
A closer look revealed that the performance of the trading model is linked to the relative correlation strength.

More advanced methods for predicting the current correlation level could lead to a better-performing and more reliable trading model.

The Shiny application with which the evaluations were carried out can be found here (please note that you have to press the “Refresh” button to update the charts).

The code can be found here.

About Author

Stefan Hainzer

Stefan holds an MS in Computer Science and has 17 years of experience as a software engineer in different areas. For the last 8 years he has focused on Microsoft SQL Server, data warehousing and business intelligence. To enhance his professional...