Elyanah: Data Tools for Elixir

Posted on Mar 30, 2016

Contributed by Tom Walsh. Tom is currently enrolled in the NYC Data Science Academy 12-week, full-time Data Science Bootcamp, which runs from January 11th to April 1st, 2016. This post is based on his capstone project (due in the 12th week of the program).

Building a neural network in Elixir was fun, but it wasn't incredibly useful. My proof of concept wasn't flexible enough for real-world applications. I decided, instead, to start at the bottom, building the basic mathematical foundations for what I hope will eventually become a useful library for data analysis in Elixir. I've named my future library Elyanah, after my friend's new daughter, and its first module is Elyanah.Numeric.

I chose to begin with basic linear algebra operations on arrays and matrices. Working on standard Elixir Lists (for arrays) and Lists of Lists (for matrices), rather than defining new structs, seemed to fit better with my understanding of the Elixir approach, which is focused on transforming data.

My first function was the array dot product:

  def dot(a, b, dims \\ :trunc) do
    zip(a, b, dims)
    |> Enum.reduce(0, fn {a,b}, acc -> acc + a * b end)
  end

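The zip function called here is not the standard Enum.zip or Stream.zip. I wanted to support some form of NumPy-style data broadcasting, at least optionally, and my zip function provides it through the dims argument. With :trunc (the default), the result is truncated to the shorter list. With :strict, lists of different lengths raise an error. With :cycle, the shorter list is repeated to match the longer one, so the data broadcasts: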

  def zip(a, b, dims \\ :trunc)
  def zip(a, b, :trunc), do: Stream.zip(a,b)
  def zip(a, b, _) when length(a) == length(b), do: Stream.zip(a,b)
  def zip(a, b, :strict) when length(a) != length(b) do
    raise ArithmeticError, message: "Array dimensions do not match."
  end
  def zip(a, b, :cycle) when length(a) > length(b) do
    Stream.zip(a, Stream.cycle(b))
  end
  def zip(a, b, :cycle) when length(a) < length(b) do
    Stream.zip(Stream.cycle(a), b)
  end
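
To make the three modes concrete, here is a minimal sketch that wraps the two functions above in a stand-alone module and runs them on lists of unequal length. The module name BroadcastSketch is an assumption made for illustration; it is not part of Elyanah.

```elixir
defmodule BroadcastSketch do
  # Hypothetical wrapper module; the function bodies follow the post.

  def dot(a, b, dims \\ :trunc) do
    zip(a, b, dims)
    |> Enum.reduce(0, fn {x, y}, acc -> acc + x * y end)
  end

  def zip(a, b, dims \\ :trunc)
  # :trunc stops at the end of the shorter list.
  def zip(a, b, :trunc), do: Stream.zip(a, b)
  # Equal lengths need no special handling in any mode.
  def zip(a, b, _) when length(a) == length(b), do: Stream.zip(a, b)
  # :strict refuses lists of different lengths.
  def zip(a, b, :strict) when length(a) != length(b) do
    raise ArithmeticError, message: "Array dimensions do not match."
  end
  # :cycle repeats the shorter list until the longer one is exhausted.
  def zip(a, b, :cycle) when length(a) > length(b) do
    Stream.zip(a, Stream.cycle(b))
  end
  def zip(a, b, :cycle) when length(a) < length(b) do
    Stream.zip(Stream.cycle(a), b)
  end
end

# 1*10 + 2*20, the last two elements of the first list are dropped
IO.inspect(BroadcastSketch.dot([1, 2, 3, 4], [10, 20]))
# 1*10 + 2*20 + 3*10 + 4*20, the second list is cycled
IO.inspect(BroadcastSketch.dot([1, 2, 3, 4], [10, 20], :cycle))
```

Note that :cycle does not require the longer length to be a multiple of the shorter one; the cycled list is simply cut off wherever the longer list ends.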

I also implemented element-wise addition, subtraction, multiplication, and division on arrays. The documentation is available here.

Next, I moved on to matrix methods. Here, for example, is multiplication, which leverages the dot product method we saw earlier. It handles multiplication of matrices by other matrices, but also by arrays and scalars.

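  def multiply([[_|_]|_] = a, [[_|_]=h|_] = b) do
    # Assumed reconstruction: dot each row of a with each column of b, then regroup into rows.
    (for aa <- a, bb <- transpose(b), do: Array.dot(aa, bb))
    |> Enum.chunk(length(h))
  end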
  def multiply([_|_] = a, [[_|_]|_] = b), do: multiply([a], b)
  def multiply([[_|_]|_] = a, [_|_] = b), do: multiply(a, transpose([b]))
  def multiply(a, [[_|_]|_] = b) do
    Enum.map(b, &(Array.multiply(a, &1)))
  end
  def multiply([[_|_]|_] = a, b), do: multiply(b, a)
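
The pattern-matching dispatch above can be tried out in isolation with a small sketch. Everything here is an assumption made for illustration: the module name MatrixSketch, the transpose/1 and dot/2 helpers, and the choice to build each output row directly instead of chunking a flat list. The scalar clauses are left out to keep it short.

```elixir
defmodule MatrixSketch do
  # Hypothetical stand-alone module; transpose/1 and dot/2 are assumed
  # helper implementations, not the library's own code.

  def transpose(rows), do: rows |> List.zip() |> Enum.map(&Tuple.to_list/1)

  def dot(a, b) do
    Enum.zip(a, b) |> Enum.reduce(0, fn {x, y}, acc -> acc + x * y end)
  end

  # Matrix x matrix: one dot product per (row of a, column of b) pair.
  def multiply([[_ | _] | _] = a, [[_ | _] | _] = b) do
    cols = transpose(b)
    for row <- a, do: Enum.map(cols, &dot(row, &1))
  end

  # Array x matrix: treat the array as a single-row matrix.
  def multiply([_ | _] = a, [[_ | _] | _] = b), do: multiply([a], b)

  # Matrix x array: treat the array as a single-column matrix.
  def multiply([[_ | _] | _] = a, [_ | _] = b), do: multiply(a, transpose([b]))
end

IO.inspect(MatrixSketch.multiply([[1, 2], [3, 4], [5, 6]], [[1, 2, 3], [4, 5, 6]]))
```

Clause order matters: the matrix-by-matrix clause must come first, because a list of lists also matches the looser [_|_] array pattern.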

I implemented many more matrix methods. The full list is available in the documentation.

To make these really convenient to use, I implemented several infix operators. The implementations match on matrices and arrays and call the appropriate functions, and otherwise fall back to the default Kernel implementations. Because the . operator in Elixir is a special form that cannot be overridden, I used the <|> operator for the dot product; it belongs to a set of infix operators that Elixir leaves open for custom implementations.

The infix operators allow us to do things like this:

  iex> use Elyanah.Numeric
  nil
  iex> [[1,2],[3,4],[5,6]] * [[1,2,3],[4,5,6]]
  [[9, 12, 15], [19, 26, 33], [29, 40, 51]]

The implementation of the * operator is as follows:

  def ([[_|_]|_] = a) * b, do: Matrix.multiply(a, b)
  def a * ([[_|_]|_] = b), do: Matrix.multiply(a, b)
  def ([_|_] = a) * b, do: Array.multiply(a, b)
  def a * ([_|_] = b), do: Array.multiply(a, b)
  def a * b, do: Kernel.*(a,b)

When defining these operators, we must make sure the built-in versions are not loaded as well. This is the method that gets invoked when we use the Elyanah.Numeric module:

  defmacro __using__(_opts) do
    quote do
      import Kernel, except: [*: 2, +: 2, -: 2, /: 2]
      import Elyanah.Numeric
    end
  end
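
The whole mechanism can be sketched end to end in a few lines. This is a simplified illustration rather than Elyanah's code: the module names OperatorSketch and OperatorDemo are hypothetical, only * is overridden, and it uses guards instead of list patterns and handles only element-wise multiplication of two lists.

```elixir
defmodule OperatorSketch do
  # Hypothetical module illustrating the technique. Only list * list
  # (element-wise) is handled; everything else falls back to Kernel.
  import Kernel, except: [*: 2]

  defmacro __using__(_opts) do
    quote do
      # Hide the built-in operator in the calling module, then bring in ours.
      import Kernel, except: [*: 2]
      import OperatorSketch
    end
  end

  def a * b when is_list(a) and is_list(b) do
    Enum.zip(a, b) |> Enum.map(fn {x, y} -> Kernel.*(x, y) end)
  end

  def a * b, do: Kernel.*(a, b)
end

defmodule OperatorDemo do
  use OperatorSketch

  # Both expressions go through OperatorSketch.*/2; the second one
  # falls through to the Kernel implementation.
  def run, do: {[1, 2, 3] * [4, 5, 6], 2 * 3}
end

IO.inspect(OperatorDemo.run())
```

Calling Kernel.*/2 by its fully qualified name inside the override is what keeps the fallback clause from recursing into itself.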

The full list of infix operators implemented for matrices and arrays is available here.

While there are certainly many more useful linear algebra operations that could be defined, my next goal is to implement some sort of DataFrame module.

Links to documentation and source code can be found on the Elyanah homepage.

About Author

Tom Walsh

Tom Walsh (M.Sc. Computer Science, University of Toronto) developed a desire to get deeper into the data while leading a team of developers at BSports building Scouting Information Systems for Major League Baseball teams. A course on Basketball...
