Data Study on Mutual Fund Correlations

Posted on Nov 17, 2016


We live in a world filled with investments. Over the past few years, investment products have been created at lightning speed. From stocks to mutual funds to ETFs, data shows that the financial industry has been aggressively promoting its products, flooding the public with investment ideas meant to help with goals from college funding to retirement saving. With this explosive growth in choices, it has become increasingly difficult for the public to decide which investment vehicle is appropriate for them.

In this study, we examined one of the major investment avenues: mutual funds. The funds we selected are from the Vanguard family, one of the biggest mutual fund firms in the United States. The goal of the study was to investigate the performance of some very popular funds from a correlation perspective and to understand how those funds might help diversify risk when investors build their portfolios.


The Data and Method

The mutual funds we chose fall into four groups: a market index fund; growth, balance and dividend funds; sector funds; and bond funds. All of the funds are domestic and from the Vanguard family.

Here are the names and brief descriptions of the funds:

VFINX  --  S&P 500 index fund
VMRGX  --  growth fund investing in large- and mid-cap U.S. companies
VDEQX  --  growth and income fund that invests in other funds from the Vanguard family
VDIGX  --  dividend growth fund
VGENX  --  energy fund focused on the energy industry
VGHCX  --  health care fund covering health care, health management and the drug industry
VGSLX  --  REIT (real estate investment trust) index fund
VBLTX  --  long-term bond index fund
VBIIX  --  intermediate-term bond index fund
VBISX  --  short-term bond index fund

The monthly, quarterly and annual returns were collected from Yahoo Finance and Morningstar using web scraping techniques implemented in Python.

The study covers the period from 2006 to 2016.

The correlations are calculated using rolling windows with a width of 12 months, 4 quarters and 4 years.
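A rolling-window correlation of this kind can be sketched in plain Python as follows. This is a minimal illustration and not the code used in the study: the function names and the return values are hypothetical, and the window is counted in periods (for example, 12 for monthly returns).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def rolling_correlation(fund, baseline, window):
    """Return one correlation per full window of `window` consecutive periods."""
    if len(fund) != len(baseline):
        raise ValueError("return series must have the same length")
    return [
        pearson(fund[i - window:i], baseline[i - window:i])
        for i in range(window, len(fund) + 1)
    ]


# Hypothetical periodic returns in percent, for illustration only.
baseline = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
fund = [2 * r + 0.1 for r in baseline]  # moves exactly in step with the baseline
corrs = rolling_correlation(fund, baseline, window=4)
```

With real data held in pandas Series, `fund.rolling(window).corr(baseline)` would likely be the more convenient route to the same quantity.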


The Data Findings

The baseline of the study is VFINX, the S&P 500 index mutual fund. Each fund's returns are compared with those of VFINX, and each fund's correlation is measured against it.
The first fund is the Vanguard Morgan Growth Fund VMRGX. This fund seeks long-term growth of capital and invests mainly in the stocks of mid- and large-capitalization U.S. companies whose revenues and/or earnings are expected to grow faster than those of average companies in the market.


The plots show that the mean returns of the VMRGX fund were 0.89%, 3.3% and 11.8% higher than those of the S&P 500 index fund VFINX on a monthly, quarterly and annual basis, and that its standard deviations were 13.9%, 8.5% and 14.3% higher over the corresponding periods. The correlations were above 90% most of the time, the only exceptions being a couple of periods measured on a monthly or quarterly basis.
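One way to produce comparisons like these is to express a fund's mean return and standard deviation as a percentage difference from the baseline. The sketch below assumes that the percentages quoted in this post are relative differences of that kind, which the post does not state explicitly. The helper names and sample returns are hypothetical.

```python
from statistics import mean, stdev


def relative_difference_pct(fund_value, baseline_value):
    """Percentage by which fund_value exceeds baseline_value.

    Not meaningful when baseline_value is zero.
    """
    return 100.0 * (fund_value - baseline_value) / abs(baseline_value)


def compare_to_baseline(fund_returns, baseline_returns):
    """Relative difference in mean return and in sample standard deviation."""
    return {
        "mean_pct": relative_difference_pct(
            mean(fund_returns), mean(baseline_returns)
        ),
        "std_pct": relative_difference_pct(
            stdev(fund_returns), stdev(baseline_returns)
        ),
    }


# Hypothetical periodic returns in percent, not the study's data.
baseline = [1.0, 2.0, 3.0]
fund = [1.0, 2.5, 4.0]
result = compare_to_baseline(fund, baseline)
```

In this toy case the fund's mean return is 25% higher than the baseline's and its standard deviation is 50% higher.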

The fund seems to overshoot the market at tops and bottoms when the market makes a turn. Overall, VMRGX is highly correlated with the broad market during the period. However, without delivering a significantly better return, it charges 2.5 times the fee VFINX charges. Investors might be better off just investing in the S&P 500 index fund.

Vanguard Dividend Growth Fund

The second fund is the Vanguard Dividend Growth Fund VDIGX.

[Plots: vdigxm, vdigxq]



The primary goal of the VDIGX fund is to provide a growing stream of income over time. The plots above show that its standard deviations are about 11%, 21% and 24% lower than those of the S&P 500 index fund on a monthly, quarterly and annual basis, reflecting lower volatility than both the growth fund and the S&P 500 index fund.

Although its monthly mean return is slightly lower than that of VFINX, its quarterly and annual mean returns are higher, and by the end of the period its annual average return is 5% higher than that of the market index fund. During the study period, its annual performance shows a negative correlation with the market index fund. Given its performance and low volatility, VDIGX would make a good candidate for portfolio diversification.
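The link between correlation and diversification can be made concrete with the standard two-asset portfolio formula, in which the portfolio variance is w1^2*s1^2 + w2^2*s2^2 + 2*w1*w2*rho*s1*s2. The sketch below is illustrative only: the function name and the inputs are hypothetical and are not taken from the study's data.

```python
from math import sqrt


def portfolio_volatility(w1, sigma1, sigma2, rho):
    """Standard deviation of a two-asset portfolio.

    w1     -- weight of asset 1 (asset 2 gets 1 - w1)
    sigma1 -- standard deviation of asset 1 returns
    sigma2 -- standard deviation of asset 2 returns
    rho    -- correlation between the two return series
    """
    w2 = 1.0 - w1
    variance = (
        (w1 * sigma1) ** 2
        + (w2 * sigma2) ** 2
        + 2.0 * w1 * w2 * rho * sigma1 * sigma2
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return sqrt(max(variance, 0.0))


# Hypothetical inputs: two funds, each with a 10% standard deviation, held 50/50.
perfectly_correlated = portfolio_volatility(0.5, 10.0, 10.0, 1.0)
uncorrelated = portfolio_volatility(0.5, 10.0, 10.0, 0.0)
negatively_correlated = portfolio_volatility(0.5, 10.0, 10.0, -0.5)
```

With the weights and individual volatilities held fixed, a lower correlation gives a lower portfolio volatility, which is why a fund with low or negative correlation to the index is attractive as a diversifier.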

Vanguard Diversified Equity Fund

Vanguard Diversified Equity Fund, VDEQX, seeks long-term capital appreciation and dividend income. This fund invests in a diversified group of other Vanguard equity mutual funds, rather than in individual securities.

[Plots: vdeqxm, vdeqxq]


The plots show that the fund is highly correlated with the S&P 500 index fund most of the time across all three time frames. The tables show that its mean returns are very close to those of the index fund as well. However, its standard deviation is 9.9%, 8.5% and 11.2% higher than that of VFINX on a monthly, quarterly and annual basis. Additionally, it charges 2.5 times the fee VFINX charges. Investors would be better off sticking with VFINX.

Vanguard Health Care Fund

Vanguard Health Care Fund, VGHCX, seeks long-term capital appreciation and invests 80% of its assets in the health care industry.

[Plots: vghcxm, vghcxaa]


The plots above show that its average returns are 23.2%, 36.3% and 52.5% higher than those of VFINX and that its standard deviations vary widely, indicating that it is highly volatile compared to VFINX. Its correlation with VFINX is lowest on the annual time frame, which indicates that it might be a good candidate for long-term holding.

Vanguard REIT Index Fund

Vanguard REIT Index Fund, VGSLX, seeks to provide a high level of income and moderate long-term capital appreciation by tracking the performance of a benchmark index that measures the performance of publicly traded equity REITs. Compared to the broad market fund, the fund does show different patterns, as shown below.

[Plots: vgslxm, vgslxq]


The fund has a higher standard deviation than the market index fund, VFINX, and it is not highly correlated with it. Combined with its lower fee, it could be a candidate for investors to add to their portfolios.

Vanguard Long-Term Bond Index Fund

Vanguard Long-Term Bond Index Fund, VBLTX, seeks to track the performance of a market-weighted bond index with a long-term dollar-weighted average maturity. Vanguard Intermediate-Term Bond Index Fund, VBIIX, seeks to track the performance of a market-weighted bond index with an intermediate-term dollar-weighted average maturity.

[Plots: bondliq, bondlia]

The plots above show that the long-term bond fund has a slightly higher average return and standard deviation than the intermediate-term bond fund, indicating higher volatility. This observation is consistent with its long-term nature.

Vanguard Short-Term Bond Index Fund

Vanguard Short-Term Bond Index Fund, VBISX, seeks to track the performance of a market-weighted bond index with a short-term dollar-weighted average maturity. Its average return and standard deviation are much lower than those of the long-term bond index fund.

[Plots: bondlsq, bondlsa]

Fed policy in recent years has had a significant impact on the return of the short-term bond fund.


Conclusions

1. Returns of growth and balance funds are highly correlated with the S&P 500 index fund, but these funds also come with higher fees.

2. Returns of sector funds have a lower correlation with the index fund and with the growth and balance funds.

3. The dividend fund behaves differently from the index fund and can be a potential candidate for diversifying the risk in an investor's portfolio.

4. The REIT fund also behaves differently from the index fund and can be a candidate for an investor's long-term holding.

5. The short-term bond fund has a lower correlation with the long-term bond fund than the intermediate-term bond fund does, and it has been significantly impacted by the Federal Reserve's interest rate policy.

6. The index fund might have a chance to outperform those growth and balance funds over time if you take into account the fees those funds charge.
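The effect of fees in the last point can be illustrated with a simple compounding sketch. Everything here is hypothetical except the 2.5x fee ratio mentioned in the findings: the expense ratios, the gross return and the holding period are made-up inputs, and the net return is approximated as the gross return minus the expense ratio.

```python
def final_value(initial, gross_return, expense_ratio, years):
    """Grow `initial` for `years` at an annual net return of
    gross_return - expense_ratio (a simplifying assumption)."""
    return initial * (1.0 + gross_return - expense_ratio) ** years


# Hypothetical inputs; only the 2.5x fee ratio comes from the text.
index_fee = 0.002             # assumed 0.20% annual expense ratio
active_fee = 2.5 * index_fee  # 2.5 times the index fund's fee
gross = 0.07                  # assumed identical gross annual return
years = 10

index_value = final_value(10_000, gross, index_fee, years)
active_value = final_value(10_000, gross, active_fee, years)
fee_drag = index_value - active_value  # cost of the higher fee over the period
```

Under these assumptions the higher-fee fund ends the period with less money unless its gross return exceeds the index fund's by at least the fee difference each year.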


About Author

Connie Zhang

Connie Zhang, a marketing specialist, has been working in the field of data analysis since 2010. She holds a Ph.D. in Engineering, an MBA, and an Associateship of the Society of Actuaries in the United States.