Project 2: Shiny Dashboard app - Data Scientist Salary Comparator

Sung Pil Moon
Posted on Feb 15, 2016

Contributed by Sung Pil Moon. He is currently attending the NYC Data Science Academy 12-week full-time Data Science Bootcamp, which runs from January 11th to April 1st, 2016. This post is based on his second class project, R Shiny (due in the 4th week of the program).

1. Introduction

Shiny provides efficient ways to manipulate and visualize data. A Shiny app lets users, with or without much R expertise, explore data and find insights in it. I implemented the Data Scientist Salary Comparator using the Shiny package to explore salary data for 8 professions, based on prevailing wage data filed by employers of foreign workers in the United States. The 8 professions are:

  • Data Scientist
  • Software Engineer
  • Data Analyst
  • Business Analyst
  • Management Consultant
  • Assistant Professor
  • Attorney
  • Teacher

 

Data set:

  • It contains prevailing wage data from employers filing applications to hire foreign workers under the Permanent Labor Certification Program (PERM), the H-1B, H-1B1, and E-3 Professional and Specialty Occupation Programs, and the H-2B Non-agricultural Temporary Labor Certification Program.
  • The dataset is from the United States Department of Labor, Employment & Training Administration. (Source)
  • Prevailing wage data for US natives are not included.
  • The filtered data for this application contains a total of 167,278 cases (19 columns) from 2015.

 

2. Structures

The Data Scientist Salary Comparator Shiny application has 5 main components:

  • Salary Scatter Plot
  • Salary Data Explorer
  • Salary Comparison Map
  • Top Recruiters
  • External Info

 

2.1. DS Salary Scatter Plot

The 'Salary Scatter Plot' panel shows the salary distribution for the 8 professions. It comprises three sections: an option input section, a plot area, and a row of aggregate summary boxes.

[Screenshot: Salary Scatter Plot panel]

The plot area shows two types of visualization: a scatter plot of all the salary data for the 8 professions, and a box plot showing the minimum, 25% quantile, median, 75% quantile, and maximum. Users can toggle the 'showing data points' option above the plot to see the box plot alone. Users can also interactively change the target state (all states, or one of the 50) and the target salary range. The plot area and the aggregate summary boxes update as soon as the user makes a change.

Each profession is drawn in its own color: assistant professor in red, attorney in orange, business analyst in light green, data analyst in green, data scientist in teal, management consultant in turquoise blue, software engineer in purple, and teacher in red-violet.
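The filtering and box plot statistics described above can be illustrated with a small base R sketch. This is not the app's actual code: the toy data frame and the helper `salary_five_num()` are hypothetical, and only the column names `JOB_TITLE_SUBGROUP`, `WORK_STATE`, and `PAID_WAGE_PER_YEAR` are taken from the server code later in this post.

```r
# Toy stand-in for the app's salary data (values are made up).
salary_toy <- data.frame(
  JOB_TITLE_SUBGROUP = c("data scientist", "data scientist", "data scientist",
                         "teacher", "teacher", "teacher"),
  WORK_STATE         = c("NY", "NY", "CA", "NY", "CA", "CA"),
  PAID_WAGE_PER_YEAR = c(100000, 120000, 140000, 50000, 60000, 70000),
  stringsAsFactors   = FALSE
)

# Hypothetical helper: filter by state and salary range, then compute the
# five box plot statistics for each profession.
salary_five_num <- function(df, state = "All", salary_range = c(0, Inf)) {
  if (state != "All") {
    df <- df[df$WORK_STATE == state, , drop = FALSE]
  }
  in_range <- df$PAID_WAGE_PER_YEAR >= salary_range[1] &
    df$PAID_WAGE_PER_YEAR <= salary_range[2]
  df <- df[in_range, , drop = FALSE]

  # One numeric vector of salaries per profession
  groups <- split(df$PAID_WAGE_PER_YEAR, df$JOB_TITLE_SUBGROUP)

  # These probabilities give min, 25% quantile, median, 75% quantile, max
  stats <- vapply(groups, quantile, numeric(5),
                  probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
  out <- t(stats)  # one row per profession
  colnames(out) <- c("min", "q25", "median", "q75", "max")
  out
}

salary_five_num(salary_toy)
salary_five_num(salary_toy, state = "CA", salary_range = c(55000, 150000))
```

In the app itself, the state and salary range would come from the input widgets rather than from function arguments.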

 

2.2. DS Salary Data Explorer

The 'Salary Data Explorer' is a data table with filtering, pagination, searching, and sorting, so that users can explore the data they are interested in.
As in the scatter plot panel, users choose options interactively and the table shows the updated result. The table can be filtered by profession (multiple choices), state, salary range, and name (of city and employer).

[Screenshot: Salary Data Explorer table]
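The four filters of the data explorer can be sketched as one plain R function. This is an illustration only, not the app's implementation: `filter_salary()` is a hypothetical helper, and the column names `EMPLOYER_NAME` and `WORK_CITY` are assumptions (the other column names appear in the server code later in this post).

```r
# Toy stand-in for the app's salary data (values are made up).
# EMPLOYER_NAME and WORK_CITY are assumed column names.
salary_toy <- data.frame(
  JOB_TITLE_SUBGROUP = c("data scientist", "software engineer",
                         "teacher", "attorney"),
  EMPLOYER_NAME      = c("Acme Analytics", "Acme Software",
                         "Springfield Schools", "Law Partners"),
  WORK_CITY          = c("New York", "San Jose", "Springfield", "New York"),
  WORK_STATE         = c("NY", "CA", "IL", "NY"),
  PAID_WAGE_PER_YEAR = c(120000, 110000, 50000, 150000),
  stringsAsFactors   = FALSE
)

# Hypothetical helper combining the four filters described above.
filter_salary <- function(df, professions = NULL, state = "All",
                          salary_range = c(0, Inf), name = "") {
  keep <- rep(TRUE, nrow(df))

  # Profession: zero or more choices; no choice means "keep all"
  if (length(professions) > 0) {
    keep <- keep & df$JOB_TITLE_SUBGROUP %in% professions
  }
  # State: "All" or a single state code
  if (state != "All") {
    keep <- keep & df$WORK_STATE == state
  }
  # Salary range: inclusive lower and upper bounds
  keep <- keep &
    df$PAID_WAGE_PER_YEAR >= salary_range[1] &
    df$PAID_WAGE_PER_YEAR <= salary_range[2]
  # Name: case-insensitive substring match on city or employer
  if (nzchar(name)) {
    pattern <- tolower(name)
    keep <- keep &
      (grepl(pattern, tolower(df$WORK_CITY), fixed = TRUE) |
         grepl(pattern, tolower(df$EMPLOYER_NAME), fixed = TRUE))
  }
  df[keep, , drop = FALSE]
}

filter_salary(salary_toy, professions = c("data scientist", "attorney"),
              state = "NY")
filter_salary(salary_toy, name = "acme")
```

Building one logical vector and subsetting once keeps each filter independent, so a filter left at its default simply has no effect.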

 

2.3. DS Salary Comparison Map

The 'Salary Comparison Map' provides a way to compare the salary distribution of two professions across the United States. When users choose the professions (job titles), the distribution maps and data tables show the updated result. Users can also sort the tables by state, average salary, and number of jobs. (Note that when the panel is first initialized, it shows all the data, unfiltered by state, profession, average salary, or number of jobs.)

[Screenshot: Salary Comparison Map panel]

 

2.4. Top Recruiter Tables

The 'Top Recruiter Tables' panel comprises 5 salary data tables showing who the top recruiters are for each profession. Each table contains the employer name, the number of jobs, the average salary, the minimum salary, the 25% quantile salary, the median salary, the 75% quantile salary, and the maximum salary.

The first table intentionally shows the salary data without distinguishing by profession, to give an overall idea of who the top recruiters are across the United States regardless of profession. The remaining four tables are summary tables that can be filtered by state, each for a specific group: data scientist, software engineer, data analyst, and other professions. (The tables are sorted by the number of jobs and the average salary in descending order.)

[Screenshot: Top Recruiter tables (1)]

[Screenshot: Top Recruiter tables (2)]
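A summary table of this shape can be sketched in base R as follows. This is a minimal illustration under assumptions, not the app's code: `top_recruiters()` is a hypothetical helper, `EMPLOYER_NAME` is an assumed column name, and the toy values are made up.

```r
# Toy stand-in for the app's salary data (values are made up).
# EMPLOYER_NAME is an assumed column name.
salary_toy <- data.frame(
  EMPLOYER_NAME      = c("Acme", "Acme", "Acme",
                         "Bolt", "Bolt", "Bolt", "Crest"),
  PAID_WAGE_PER_YEAR = c(100000, 110000, 120000,
                         90000, 100000, 110000, 200000),
  stringsAsFactors   = FALSE
)

# Hypothetical helper: one row per employer with the columns listed above,
# sorted by number of jobs, then average salary, both descending.
top_recruiters <- function(df, n = 10) {
  groups <- split(df$PAID_WAGE_PER_YEAR, df$EMPLOYER_NAME)
  out <- data.frame(
    EMPLOYER   = names(groups),
    JOBS       = vapply(groups, length, integer(1)),
    AVG_SALARY = vapply(groups, mean, numeric(1)),
    MIN        = vapply(groups, min, numeric(1)),
    Q25        = vapply(groups, quantile, numeric(1),
                        probs = 0.25, names = FALSE),
    MEDIAN     = vapply(groups, median, numeric(1)),
    Q75        = vapply(groups, quantile, numeric(1),
                        probs = 0.75, names = FALSE),
    MAX        = vapply(groups, max, numeric(1)),
    row.names  = NULL,
    stringsAsFactors = FALSE
  )
  out <- out[order(-out$JOBS, -out$AVG_SALARY), ]
  head(out, n)
}

top_recruiters(salary_toy)
```

For the per-profession tables, the same helper could be applied to a subset of the data, for example only the rows for one profession and one state.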

 

2.5. External Resources

This 'External Resources' panel shows a collection of valuable and meaningful information from external sources. The external resources are embedded or regenerated for better readability and interactivity. (All sources and author names of the external resources are included)

[Screenshot: External Resources panel]

 

3. Code

Since this Shiny application has a lot of code, some of it duplicated, only the essential code snippets are described here. You can access the full code on my GitHub page (here).

 

a. UI part

This code snippet shows the basic structure of the UI part of the Shiny dashboard, which has three components (header, sidebar, and body). The sidebar has 5 'menuItem' components, and the body has 5 corresponding 'tabItem' components under one 'tabItems' component. (Duplicates are intentionally omitted.)

library(shiny)
library(shinydashboard)

header <- dashboardHeader(
  title = "DS Salary Explorer"
)

# icon() adds the "fa-" prefix itself, so only the bare icon name is passed
sidebar <- dashboardSidebar(
  sidebarMenu(
    menuItem("Salary Scatter Plot", 
             tabName = "myTabForScatterPlot", 
             icon = icon("bar-chart-o")),
    menuItem("Salary Data Explorer", 
             tabName = "myTabForDataTable", 
             icon = icon("table")),    
    menuItem("Salary Comparison Map", 
             tabName = "myTabForGvisMap", 
             icon = icon("map-marker")),
    menuItem("Top Recruiters", 
             tabName = "myTabForRecruitRanking", 
             icon = icon("list-ol")),
    menuItem("External Info", 
             tabName = "myTabForExternalInfo", 
             icon = icon("external-link"))
  )
) 

body <- dashboardBody(
  
  tabItems(
    tabItem("myTabForScatterPlot", h2("Salary Data Scatter Plot"),
            # ... more sub components in this tabItem
    ), 
    tabItem("myTabForDataTable", h2("DS Salary Data Explorer"),
            # ... more sub components in this tabItem
    ),
    tabItem("myTabForGvisMap", h2("Salary Comparison Map"),
            
            fluidRow(
              box(
                title = "Map 1", solidHeader = TRUE, collapsible = TRUE,
                htmlOutput("myGvisMap1") 
              ),
              box(
                title = "Map 2", solidHeader = TRUE, collapsible = TRUE,
                htmlOutput("myGvisMap2") 
              )
            ),
            fluidRow(
              box(
                title = "DataTable for Map 1", 
                solidHeader = TRUE, collapsible = TRUE,
                DT::dataTableOutput("myComparisonTableByJobTitle1")
              ),
              box(
                title = "DataTable for Map 2", 
                solidHeader = TRUE, collapsible = TRUE,
                DT::dataTableOutput("myComparisonTableByJobTitle2")
              )
            )
    ),
    tabItem("myTabForRecruitRanking", h2("Top Recruiters"),
            # ... more sub components in this tabItem
    ),
    tabItem("myTabForExternalInfo", h2("External sources"),
            # ... more sub components in this tabItem
    )
  ) # end of tabItems
) # end of body

 

b. Server part

The code snippet below shows the basic structure of the server part, which manipulates the data based on user input, using the comparison map and the comparison data table as an example. In brief, both the map and the data table call the reactive function updateInputDataForMapByJobTitle1(), which returns the filtered data, so the map and the table stay in sync and show the updated results together.

server <- function(input, output) { 
  
  # ...
  # ... Other functions are intentionally omitted for brevity ...
  # ...
  
  #///////////////////////////////////////////////////////////////////////////
  # reactive function for comparison Map 1 and comparison table 1
  #///////////////////////////////////////////////////////////////////////////    
  updateInputDataForMapByJobTitle1 <- reactive({  
    
    # Start from the original data 'salary_refined'
    # (any filtering on the user's input is not shown in this excerpt)
    dataFilteredForMapByJobTitle1 <- salary_refined   
    
    # Aggregate by state and profession: average salary and number of positions
    dataFilteredForMapByJobTitle1 <- dataFilteredForMapByJobTitle1 %>% 
      group_by(WORK_STATE, JOB_TITLE_SUBGROUP) %>% 
      summarise(AVG_SALARY = round(mean(PAID_WAGE_PER_YEAR), 2), NUM_POS = n())
       
    dataFilteredForMapByJobTitle1 # return the resulting data
    
  })
  
  #///////////////////////////////////////////////////////////////////////////
  # comparison Map 1 (googleVis)
  #///////////////////////////////////////////////////////////////////////////
  output$myGvisMap1 <- renderGvis({
    
    # Call updateInputDataForMapByJobTitle1() to get the filtered data.
    # Because it is a reactive expression, the map re-renders whenever
    # the user input it depends on changes.
    mapData <- updateInputDataForMapByJobTitle1() 
    
    # Render the map using the filtered data
    gvisGeoChart(mapData, locationvar= "WORK_STATE", colorvar="AVG_SALARY",
                 options=list(region="US", displayMode="regions",
                              resolution="provinces", 
                              width="100%", 
                              backgroundColor="gray"
                 )
    )
  })  
  
  #///////////////////////////////////////////////////////////////////////////
  # Comparison data table 1 (DT)
  #///////////////////////////////////////////////////////////////////////////
  output$myComparisonTableByJobTitle1 <- DT::renderDataTable(DT::datatable({ 
    
    # Call updateInputDataForMapByJobTitle1() to get the filtered data.
    # The table depends on the same reactive expression as the map, so
    # both update together when the user input changes.
    dataForDTable1 <- updateInputDataForMapByJobTitle1() 
    
    # Change the column names
    colnames(dataForDTable1) <- c("STATE","JOB_TITLE","AVG_SALARY", "JOBS") 
    
    dataForDTable1 # filtered data for the dataTable
    
  }, rownames = FALSE, 
    extensions = c('ColVis','ColReorder','Scroller'), options = list(
    deferRender = TRUE,  
    searching = TRUE,
    dom = 'RClfrtip',
    colVis = list(activate = 'mouseover'),
    lengthMenu = list(c(10, 5, 15, 20, 25, 50, 100), 
                      c('10', '5', '15', '20', '25','50','100'))
  )) %>% formatCurrency(c('AVG_SALARY'), "$") ) 
 
  # ...
  # ... Other functions are intentionally omitted for brevity ...
  # ...
  
}
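The pattern used in the excerpt above, one reactive expression feeding two outputs, can be tried in isolation with a much smaller app. The sketch below is only an illustration under assumptions: the toy data, the input ID `job`, and the output IDs `summaryText` and `detailTable` are hypothetical and do not come from the original application.

```r
library(shiny)

# Toy stand-in for the app's salary data (values are made up).
salary_toy <- data.frame(
  JOB_TITLE_SUBGROUP = c("data scientist", "data scientist", "teacher"),
  WORK_STATE         = c("NY", "CA", "NY"),
  PAID_WAGE_PER_YEAR = c(120000, 140000, 50000),
  stringsAsFactors   = FALSE
)

ui <- fluidPage(
  selectInput("job", "Profession",
              choices = unique(salary_toy$JOB_TITLE_SUBGROUP)),
  textOutput("summaryText"),
  tableOutput("detailTable")
)

server <- function(input, output, session) {
  # Shared reactive expression: re-evaluated only when input$job changes
  filteredData <- reactive({
    salary_toy[salary_toy$JOB_TITLE_SUBGROUP == input$job, , drop = FALSE]
  })

  # Both outputs read the same reactive, so they always stay in sync
  output$summaryText <- renderText({
    paste("Rows:", nrow(filteredData()))
  })
  output$detailTable <- renderTable({
    filteredData()
  })
}

# Build the app object; start it interactively with runApp(app)
app <- shinyApp(ui, server)
```

Because the filtering lives in a single reactive expression, it runs once per input change and its result is reused by both outputs, which is the same reason the map and the table in the dashboard show consistent data.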

 

c. Calling the Shiny Dashboard app

Then, you can call the Shiny dashboard app as shown below:

shinyApp(
  ui = dashboardPage(header, sidebar, body, skin = "black"), 
  server
)

 

  • If you have any suggestions, questions, or reviews for my Shiny Dashboard app, please leave a comment. Also, if any of the information above is incorrect or needs to be updated, please send an email to [email protected]

About Author

Sung Pil Moon

Sung Moon is a recent graduate from the Ph.D. program in Human-Computer Interaction, School of Informatics, Indiana University (Indianapolis, IN). Through several startup activities and various research projects collaborating with MITRE, a research corporation, he found opportunities to...