Data Science On The Go with Docker and Raspberry Pi
Overview & Scope
The image on the left is a Python notebook running in Jupyter Lab. The image on the right is a Python 3 Jupyter Notebook, which is limited to Python only. Jupyter Lab is an early-stage version of the next-generation notebook program from Jupyter, and it is designed to support many programming languages, including both Python and R.
The goal of this project is to configure and set up a Raspberry Pi computer as a self-contained data science development platform with a web interface usable from nearly any computer, tablet, or other device with a web browser.
The second focus is ease of setup. With the exception of the MySQL server, all of the software for Python and R has been built into a custom ARM-compatible Docker image. Using a Docker image helps with scalability, as it makes it very easy to deploy the software to one or 100+ Raspberry Pi computers.
The first component is the database server. I chose to include MySQL because it is well supported on the ARM architecture, it ties into R and Python very easily, and it is well supported by phpmyadmin's web interface. Next are the programming components, R and Python. I have grown very comfortable using Jupyter Notebooks to process datasets in Python and to perform exploratory data analysis and visualization, so I looked for a similar option that would work with both Python and R.
Jupyter Lab seems to fit this need rather well. It is a similar product to Jupyter Notebook, a web-based notebook programming environment, but it extends well beyond Jupyter Notebook's limited scope by allowing additional programming languages in the same 'notebook' environment.
Initial Configuration
Depending on the Raspberry Pi model used, there are a few configuration points of note.
The Raspberry Pi 3 can be connected to the network either with a standard CAT5/RJ45 ethernet cable or over Wi-Fi, if available.
For the Raspberry Pi Zero and Raspberry Pi Zero W there is the option to use a single USB cable for data and power. The Pi Zero models support a feature called USB Gadget mode, which allows the Raspberry Pi Zero (and Zero W) to simulate an ethernet connection through USB (or a serial connection, a USB mass storage device, a USB keyboard, a USB webcam...).
Enabling ethernet over USB on the Pi Zero takes a few steps.
After flashing your microSD card with the latest version of Raspbian Linux, we'll need to edit a few files. On my Mac, they are mounted to the directory /Volumes/
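- First, edit the /boot/config.txt file and add the following line to the bottom of the file:
dtoverlay=dwc2
- Next, edit the /boot/cmdline.txt file and replace its contents with the following, entered as a single line (the backslashes only mark line breaks for display, and the root=PARTUUID value should be the one already in your own file):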
dwc_otg.lpm_enable=0 console=serial0,115200 console=tty1 \
root=PARTUUID=cd0b7f54-02 rootfstype=ext4 elevator=deadline \
fsck.repair=yes rootwait modules-load=dwc2,g_ether g_ether.dev_addr=12:23:33:43:53:66 \
g_ether.host_addr=12:33:44:55:66:77
- Lastly, create an empty file named 'ssh' (leave out any file extension!):
touch /Volumes/boot/ssh
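The three edits above can also be scripted. The following is a minimal sketch, assuming Python 3 on the computer that has the card mounted; the function name `enable_usb_ethernet` is hypothetical. Unlike the manual steps, it inserts only `modules-load=dwc2,g_ether` into the existing cmdline.txt rather than replacing the whole line, so it does not set the `g_ether.dev_addr` and `g_ether.host_addr` values shown above.

```python
from pathlib import Path


def enable_usb_ethernet(boot_dir):
    """Apply the three boot-partition edits for ethernet over USB."""
    boot = Path(boot_dir)

    # 1. Append the dwc2 overlay to config.txt unless it is already there.
    config = boot / "config.txt"
    lines = config.read_text().splitlines()
    if "dtoverlay=dwc2" not in lines:
        lines.append("dtoverlay=dwc2")
        config.write_text("\n".join(lines) + "\n")

    # 2. cmdline.txt is one line of space-separated options; add the module
    #    list right after "rootwait" and leave every other option untouched.
    cmdline = boot / "cmdline.txt"
    options = cmdline.read_text().split()
    if not any(opt.startswith("modules-load=") for opt in options):
        if "rootwait" in options:
            position = options.index("rootwait") + 1
        else:
            position = len(options)
        options.insert(position, "modules-load=dwc2,g_ether")
        cmdline.write_text(" ".join(options) + "\n")

    # 3. An empty file named "ssh" turns on the SSH server at first boot.
    (boot / "ssh").touch()
```

On a Mac with the card mounted as in the touch command above, this would be called as `enable_usb_ethernet("/Volumes/boot")`. Running it twice is harmless, since each step checks whether its change is already present.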
Before we connect the Pi Zero through USB to a computer, we'll need to change some network settings.
On OS X, open Settings, then Sharing, and enable Internet Sharing for RNDIS/Ethernet Gadget.
Now we can plug in our Raspberry Pi, but be sure to use the USB port in the middle, as the one on the edge is for power only. Then, in the Network section of Settings, we'll need to adjust the settings for the new RNDIS/Gadget connection that pops up. Set the network to 'DHCP with manual address' and select 192.168.137.1 for the IP address. For some reason, my Pi Zero picks up an IP in the 192.168.2.x range, but it has full internet access. Keep in mind that, without configuring an additional network bridge, the other devices on your network will not be able to connect to the Pi Zero when it uses ethernet over USB.
As you can see, the Pi Zero (hostname rpi) is connected over USB with IP address 192.168.2.15 and is able to ping google.com, verifying that the internet is accessible. The Chrome browser has successfully connected to Jupyter Lab running on rpi.local over USB.
Installation
Note: you will need to install the latest version of Raspbian Linux and run sudo raspi-config to change the default hostname from raspberrypi, add the appropriate localization settings, limit GPU memory to 16 MB, and enable the SSH server (unless you added a file titled ssh in the /boot directory after burning the disk image onto the microSD card).
Connect to your Raspberry Pi over ssh:
ssh pi@raspberrypi.local
The above command will work unless you changed the username or hostname; if you did, edit those fields accordingly. Alternatively, you can pass the IP address in lieu of the hostname, though I would advise against using the IP address.
Here are generic versions of the ssh command:
ssh user@hostname
ssh user@ip
- eg:
ssh pi@192.168.2.10
*Note, this step is optional if you don't need a database server.*
First, update apt-get's package lists and install mysql-server:
sudo apt-get update && sudo apt-get install -y mysql-server mysql-client libmysqlclient15-dev
Next, a grey/blue prompt will ask you to set an administrative password for the MySQL 'root' user.
For Python access, install python-mysqldb and SQLAlchemy:
sudo apt-get install -y python-mysqldb && pip install -U SQLAlchemy
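Once both packages are installed, connecting from Python looks roughly like the sketch below. It assumes a Python 3 environment with SQLAlchemy and a MySQLdb-compatible driver available; the helper names `mysql_url` and `list_tables` and the credentials in the usage note are hypothetical. The default host of localhost means the script runs on the Pi itself, which works before any change to the server's network settings.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, database, host="localhost", port=3306):
    """Build a SQLAlchemy connection URL for the MySQLdb driver."""
    # quote_plus escapes characters such as '@' or ' ' in the credentials
    # so they cannot be mistaken for URL separators.
    return "mysql+mysqldb://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


def list_tables(url):
    """Connect and return the table names (needs SQLAlchemy and a running server)."""
    # Imported here so that building a URL does not require SQLAlchemy.
    from sqlalchemy import create_engine, inspect

    engine = create_engine(url)
    return inspect(engine).get_table_names()
```

For example, `list_tables(mysql_url("root", "your-password", "mysql"))` would list the tables of the built-in mysql database, with the placeholder replaced by the root password chosen during installation.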
Next, edit the my.cnf file located in /etc/mysql and change line 43 by adding a '#' before 'bind-address = 127.0.0.1' to allow the MySQL server to accept incoming connections from other computers on the network.
sudo nano /etc/mysql/my.cnf
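If you would rather not hunt for the line by number, the same change can be expressed as a small text transformation. This is only a sketch, assuming Python 3; the function name `comment_out_bind_address` is hypothetical, and it works on the file's text so it can be tried out safely before touching the real file.

```python
import re


def comment_out_bind_address(cnf_text):
    """Return my.cnf text with any active bind-address line commented out."""
    # Match only lines that start (after optional indentation) with
    # "bind-address", so lines that are already commented stay unchanged.
    pattern = re.compile(r"^([ \t]*)(bind-address[ \t]*=.*)$", re.MULTILINE)
    return pattern.sub(r"\1#\2", cnf_text)
```

Writing the result back to /etc/mysql/my.cnf requires root privileges, and the MySQL server has to be restarted (for example with sudo service mysql restart) before the change takes effect.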
If everything was successful, we're ready to install phpmyadmin.
Install the following required dependencies:
sudo apt-get install apache2 php5 libapache2-mod-php5
Finally, we can install phpmyadmin:
sudo apt-get install phpmyadmin
You'll see another blue/grey screen asking you to continue, and one asking which web server to use. Given the choice between Apache2 and lighttpd, I suggest selecting Apache2.
Once prompted for a password, enter the one previously entered during the MySQL setup.
Next, we'll need to edit Apache's configuration file to include phpmyadmin. Add the line
Include /etc/phpmyadmin/apache.conf
to the end of apache2.conf, located at:
/etc/apache2/apache2.conf
You can edit the file with nano or another text editor:
sudo nano /etc/apache2/apache2.conf
Save and exit by pressing CTRL + x, then y, then enter.
Restart Apache:
sudo service apache2 restart
Now we can access the phpmyadmin web interface for the MySQL server from any computer connected to the same network as the Raspberry Pi. You can use the Raspberry Pi's hostname or IP address in the URLs below:
http://<hostname>/phpmyadmin - eg: http://raspberrypi.local/phpmyadmin
http://<ip>/phpmyadmin - eg http://192.168.1.5/phpmyadmin
Here is the main login page for phpmyadmin - Note the URL: http://rpi.local/phpmyadmin
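To confirm from another machine that the page is being served, without opening a browser, something like the following sketch can be used. It assumes Python 3 and relies only on the standard library; the function names `phpmyadmin_url` and `is_reachable` are hypothetical.

```python
from urllib.error import URLError
from urllib.request import urlopen


def phpmyadmin_url(host):
    """Build the phpmyadmin address from a hostname or IP address."""
    return "http://{}/phpmyadmin".format(host)


def is_reachable(url, timeout=5):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP errors.
        return False
```

For example, `is_reachable(phpmyadmin_url("raspberrypi.local"))` should return True once Apache has been restarted, and False if the hostname does not resolve or the web server is not running.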
Keep in mind, you can adjust the hostname from the default raspberrypi by running
sudo raspi-config
and selecting hostname from the main menu. Also note the TLD (top-level domain, aka web address suffix, e.g. .com, .net, .co.uk, .us, .cn...) of .local. That is the default behavior unless you edit your /etc/hosts file (on a Mac or Linux machine) or the equivalent hosts file on Windows. For example, since my Raspberry Pi with hostname rpi has IP address 10.0.0.24, if I add the line:
10.0.0.24 rpi
to the end of the /etc/hosts file on my Mac laptop, then I can drop the .local suffix when I connect to the Raspberry Pi, either through a web interface like phpmyadmin or through a terminal with the ssh command.
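The hosts-file edit can be sketched as a function that only adds the mapping when the hostname is not already listed. This assumes Python 3; the name `add_hosts_entry` is hypothetical, and the function works on the file's text rather than on /etc/hosts directly, since writing to that file requires root privileges.

```python
def add_hosts_entry(hosts_text, ip, hostname):
    """Return hosts-file text with 'ip hostname' appended if it is missing."""
    for line in hosts_text.splitlines():
        # Ignore comments, then split into the address and its names.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            return hosts_text  # hostname is already mapped, change nothing
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + "{} {}\n".format(ip, hostname)
```

With the values from the example above, `add_hosts_entry(text, "10.0.0.24", "rpi")` returns the original text followed by the line 10.0.0.24 rpi, and returns the text unchanged on a second call.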
phpmyadmin's web interface allows novice SQL users to administer a server and to easily add or remove databases, tables, and users.
Read on to page 3 for installation of Docker and the raspberrypi-jupyterlab image I added to the DockerHub repository edenbaus/raspberrypi-jupyterlab.