Data What Interested Us the Week of February 16, 2015

Posted on Feb 21, 2015

Project GitHub | LinkedIn:   Niki   Moritz   Hao-Wei   Matthew   Oren


Big Data: Telling The Story Of Falling Oil Prices

Jiayu Peng, 2/20/15

The fall in oil prices is one of the most prominent topics in the news right now, so any data enthusiast will be curious to see what big data can tell us about the oil market.

Recently, the Rebaie Analytics Group analyzed thousands of news articles mentioning oil and gas, published before and after the fall in oil prices, over a six-month time frame. They used data from GDELT, which monitors broadcast, print, and web news around the world.

Based on these data, the research group constructed a network diagram that shows the “communities” of conversation around “Oil & Gas”. Using data-mining techniques in networks, they identified significant influencers on the oil price, and the people with whom they are most closely connected. These results provide valuable insights and evidence for further interpretations.
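As a rough illustration of this kind of network analysis, the sketch below builds a co-mention graph from a handful of made-up articles and ranks the actors by degree centrality. The article data, the placeholder actor names, and the choice of degree centrality are all assumptions for illustration; the source does not say which measures the Rebaie Analytics Group actually used.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: each list holds the actors mentioned together in one article.
# These placeholders are NOT taken from GDELT or from the study described above.
articles = [
    ["actor_a", "actor_b", "actor_c"],
    ["actor_a", "actor_d"],
    ["actor_a", "actor_b"],
    ["actor_b", "actor_c"],
]

def build_graph(articles):
    """Link every pair of actors that appear in the same article."""
    graph = defaultdict(set)
    for actors in articles:
        for a, b in combinations(sorted(set(actors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def degree_centrality(graph):
    """Share of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

graph = build_graph(articles)
centrality = degree_centrality(graph)

# Rank actors from most to least connected, breaking ties alphabetically.
ranking = sorted(centrality, key=lambda name: (-centrality[name], name))
for name in ranking:
    print(f"{name}: {centrality[name]:.2f}")
```

In this toy graph the actor mentioned alongside every other actor ranks first. A real analysis would likely weight edges by how often two actors co-occur and apply a community-detection step to find the "communities" of conversation.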

Reference:
Big Data: Telling the Story of Falling Prices

IBM, G.E. and Others Create Big Data Alliance

Jiayu Peng, 2/20/15

A key element of the big data business is getting what much of computer technology secretly craves: Normality.

A new big data alliance, named the "Open Data Platform", has formed around developing products based on a common core of Hadoop's key components. The members of this alliance, including GE, Hortonworks, IBM, Infosys, Pivotal, and SAS, announced a common set of standards for Hadoop.

Hadoop is perhaps the most widespread framework for distributing, managing and processing big data. However, the technology has been somewhat difficult to use, and there are concerns that the growing use of different Hadoop variants, even ones with only slight differences, could slow down the market. A shared set of standards, backed by large companies, should help reduce that fragmentation.

Reference:

IBM, G.E. and Others Create Big Data Alliance

Tech Companies Unite Open Data Platform

Oracle's New Products Aim to Combine Big Data from Multiple Sources

Jiayu Peng, 2/20/15

Oracle announced four new products on Thursday, targeting one of the core challenges in big data efforts: combining data from multiple sources.

Oracle Big Data Discovery, for example, is designed to serve as the "visual face of Hadoop" for business users. With an interface intended to offer an experience as familiar as shopping online, it lets users not just find and explore data from across multiple sources but also analyze it and share the results, all from a single tool.

Another new product is called "GoldenGate for Big Data", a Hadoop-based tool that allows users to stream real-time, unstructured data from heterogeneous transactional systems into big-data systems including Apache Hadoop, Apache Hive, Apache HBase and Apache Flume.

"Oracle gives customers an integrated platform that helps simplify access to all their data, discover new insights, predict outcomes in real time, and keep all their data governed and secure," said Neil Mendelson, vice president of big data at Oracle.

Reference:
Oracle Steps Up Its Big Data Push with New Products

Internet of DNA: Medicine's Next Great Advance

Jiayu Peng, 2/20/15

In January, programmers in Toronto began testing a system for trading genetic information with other hospitals. These facilities, in locations including Miami, Baltimore, and Cambridge, U.K., also treat children with so-called Mendelian disorders, which are caused by a rare mutation in a single gene. The system, called MatchMaker Exchange, represents something new: a way to automate the comparison of DNA from sick people around the world.

Communication between DNA databases would be clearly beneficial: if a global network of millions of genomes were established, everyone's medical treatment could draw on the experiences of millions of others. However, technical issues still prevent genomic data from being shared across the web. For example, there are no standard protocols, application programming interfaces (APIs), or file formats for DNA.

Fortunately, scientists are targeting these issues, and the MatchMaker Exchange system is a breakthrough. If successfully built, the Internet of DNA could be medicine’s next great advance.

Reference:
Internet of DNA

Governments Must Embrace IoT for Smart Cities

Bob Violino,  2/18/15

Mr. Violino's main message comes in a quote at the end of his article from Ruthbea Yesner Clarke, director of the Smart Cities Strategies program at IDC Government Insights: "The Internet of Things is an emerging reality, and U.S. cities and states cannot avoid the ramifications of new IP-enabled and connected devices and their potential impact on the delivery of government services and on the quality of life of citizens."

The idea of smart cities is to improve the quality of life in a number of areas. Smart cities can mean less traffic, better EMS response, reduced greenhouse gas emissions, and generally better service to the community. Achieving this would take a combination of technologies, including cloud, mobile, social networks and big data/analytics. Cities can realize a return on their investment in the form of lower costs as well as an increase in public good.

The stumbling block is mostly a lack of awareness of what IoT can mean for a city and a lack of experience in this area. "Many department leaders have specific problems they’d like to solve that would be a fit for IoT, but they’re not clear on what IoT means in practical terms, what are specific use cases, and what other cities have already tested and tried," Mr. Violino writes. So IoT is coming; it will just take a while to catch on.
