Optimizing Productivity with ChatGPT: How Smart Data Scientists are Using AI to Work Smarter, Not Harder
In a world where data-driven decision-making is key to gaining a competitive edge, business leaders are turning to cutting-edge technology like generative AI to unlock the full potential of data. The unveiling and adoption of ChatGPT brings us closer to this goal, empowering us to make informed decisions that are backed by data. ChatGPT can be leveraged for a variety of use cases, including boosting productivity, optimizing schedules, refining content, and simplifying technical concepts for non-technical audiences. By providing accessibility and facilitating learning, ChatGPT enables us to connect the dots and make a positive impact on the future. Perhaps most importantly, ChatGPT helps us save time, our most precious and limited resource.
The goal of documenting our own use of ChatGPT in the bootcamp curriculum is to share our insights for the benefit of current and future bootcamp participants, as well as readers interested in how they might make use of ChatGPT. Our current objective is to document how each of us, as co-authors, uses ChatGPT to reinforce our learning and advance our data science projects. Let's get started!
Debbie's Perspective
Check out my slides for a summary.
As a newcomer to R, I've found ChatGPT to be extremely helpful for converting Python code into R code and then into R Shiny code for public consumption. With intermediate knowledge of Python, I've found that constructing what I need in Python first and feeding these specific prompts into the user interface saves me a lot of time. However, when dealing with a large quantity of visualizations, debugging can become challenging, especially when all your transformations and ggplots are in one server.R file. To overcome this challenge, I suggest separating different visualization transformations using source('XXX.R') and saving your code when it works to make continuous progress over time. Limiting the amount of time allowed to work on each visualization can also help you generate a minimum viable product (MVP) and stay motivated.
If you're encountering errors in your R Shiny code, ChatGPT can help you debug it. However, it's important to keep in mind that it has limitations: it does not know the quirks of your dataset, so you may need to add code by hand, for example to skip records with nulls when summing. To catch these issues, run the code line by line and check that the output matches the expected results from your Python code.
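As a rough illustration of that kind of check, the sketch below computes a null-aware reference total in Python that can be compared against the number the R code produces. The column values, the `r_total` figure, and the helper name `sum_skipping_nulls` are made up for illustration. On the R side the corresponding guard is typically `sum(x, na.rm = TRUE)`, since a plain `sum(x)` returns `NA` when any value is missing.

```python
import math


def sum_skipping_nulls(values):
    """Sum numeric values, ignoring None and NaN entries."""
    return math.fsum(
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    )


# Hypothetical column with missing records.
sales = [120.0, None, 75.5, float("nan"), 30.0]

python_total = sum_skipping_nulls(sales)

# Hypothetical value copied from the R console after running the
# equivalent aggregation there.
r_total = 225.5

# Compare with a tolerance rather than exact float equality.
totals_match = math.isclose(python_total, r_total, rel_tol=1e-9)
print(f"Python total: {python_total}, R total: {r_total}, match: {totals_match}")
```

If the two totals disagree, missing values are a good first suspect, since the two languages handle them differently by default.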
Using ChatGPT can sometimes be challenging because it may not always comprehend your exact intentions. For instance, if you request it to create a heatmap without providing specific instructions, it may not generate the desired outcome immediately. Nevertheless, you can help it understand your goals more accurately by experimenting and refining your prompts. One helpful trick is to start by asking it to create a simple map, then gradually provide more detailed instructions for building a customized heatmap that meets your project requirements. I suggest interacting with it as if you were working alongside a colleague to facilitate the production of valuable insights and visualizations.
It can also help you summarize lengthy articles or blog posts. While it may not always capture the specific point you want to summarize, it can help you pick out some of the key points. To make the most of this tool, I suggest feeding it the specific section you want summarized or don't understand and verifying that the point it's making aligns with the author's.
ChatGPT can also be useful when planning presentations or editing blog posts, like this one. However, it's important to remember that its style may need some tweaks, as it may generate what sounds like marketing copy and repeat points unnecessarily. That’s why you still have to take the time to edit the output to ensure that it sounds like your own voice and that the insights are accurate.
As I continue to use this new technology, I have found it to be valuable in enhancing my retention of new material when going through new modules or during live learning sessions. This would not only prove beneficial for data science conversations down the road but also in my everyday life. As AI technology advances at a rapid pace, it's essential that we remain conscientious about building for good and responsibly leveraging its power to enrich our learning and skills. Let's continue to explore the potential of AI while staying mindful of its impact on individuals and society as a whole.
Daniel's Thoughts
With its natural language input and general effectiveness, ChatGPT is a boon for productivity. However, a cursory use of ChatGPT in a field you know well will often lead to statements you know are incorrect. This raises two very important questions:
- How do we account for inaccuracies?
- How do we ensure we don’t learn incorrect knowledge from ChatGPT?
The good news is that ChatGPT excels at basic things. Boilerplate code and simple scripts can be quickly produced. The code is usually readable and includes comments. Consequently, it can usually run with just a bit of adjustment. This is especially helpful for languages you seldom use (for me, bash scripts). Note that for an actual codebase, you should make an effort to adjust generated code to the paradigm(s) used in that codebase; otherwise, you may find yourself in a confusing zoo.
Writing boilerplate is often one of the most boring parts of a project, so getting it out of the way lets you address the meat far more quickly. ChatGPT also knows the documentation of well-trodden packages and libraries fairly well, so it can describe unfamiliar syntax or API calls for which you have an example. Misunderstandings arise from more complex prompts, but I’ve had some reasonable success building more complicated code iteratively.
There are profound implications to these capabilities. We can do boring work orders of magnitude more quickly and reduce our cognitive load, letting us focus on what matters. However, some potential pitfalls are born from these advantages. A major concern is hitting knowledge plateaus or not being fully confident about your knowledge. During my time teaching, I realized that students with an overreliance on calculators were less confident and did not perform as well as those who could also do calculations on their own. Of course, using a calculator doesn’t make someone worse at arithmetic in and of itself, but students who did well had different patterns of use.
The same will likely be true of ChatGPT: those who use it in a disengaged manner or as an all-knowing oracle leave themselves vulnerable to stasis or misunderstandings. Worse, doing so repeatedly could instill deep misconceptions which are difficult to untangle.
But it is possible to intentionally counteract these learning-related issues. As with most things, it’s down to how you relate to it. Passive interaction can lead to inertia. To counteract that, you have to be engaged and curious while using ChatGPT. Explicitly think about what a response teaches. Take note of any unfamiliar syntax and try implementing it in other ways to gain further familiarity. Especially for iteratively built scripts with non-trivial complexity, deliberately consider how parts of the code interact with each other. Ask for explanations of code you don’t understand, and check that they’re consistent with what you already know.
Many new LLM-related technologies are being developed, some filling a similar niche to ChatGPT. Hence, it may be worthwhile to test out these alternative technologies. Microsoft’s Bing is free, GPT-4 powered, and includes information from searches within its context window. This allows you to check out citations to see if the information remained intact in the response. Google is currently making moves to improve the coding functionality of Bard (also free). These search capabilities mean that neither Bing nor Bard is limited by the September 2021 training data cut-off that constrains ChatGPT. One other website to consider is phind (pronounced "find"). This is a free GPT-4-powered search engine geared toward developers. For questions about documentation or APIs, phind is my go-to resource these days.
Recently, we reached the point where LLM-powered tech is useful professionally, but ultimately, we are still at the early-adopter phase of this technology. Models will improve, related tools will be integrated into workflows, and our habits of use will evolve alongside. As the technology becomes more robust and trustworthy, it will become easier to trust it blindly, like a calculator. But that’s a trap. Don’t fall for it - stay engaged. Want to become or remain an expert? Don’t shirk your foundations.
Sam's Tips
Here are my 7 use-case tips to get you started (links to screen recordings)
- Demystify code: Copy a section of code and ask ChatGPT to explain it in layman's terms line by line. This will help you better understand the code.
- Enhance code quality: Share your code with ChatGPT and ask for suggestions on how to improve it in terms of efficiency, readability, or cost.
- Develop your Shiny app UI: Seek assistance from ChatGPT in starting your R ui.R file or any other components of your Shiny app.
- Write blog posts: Provide ChatGPT with the transcript of your presentation and ask it to help you start or structure a blog post based on the content.
- Manipulate data: Ask ChatGPT for help with filtering, combining, creating new columns, checking for NaNs, plotting, writing functions, or suggesting useful statistical analyses for your DataFrame. This can be a quicker alternative to referring to your cheat sheet.
- Troubleshoot errors: Share your code and any associated errors with ChatGPT, and ask for guidance on what might be wrong or how to troubleshoot the issue.
- Perform and comprehend statistical analysis: Discuss your desired outcomes with ChatGPT and ask for recommendations on which statistical analyses might be most useful for achieving those results. *This demonstrates several minutes of wrestling with ChatGPT to get to the desired goal.
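To make the data manipulation tip above a little more concrete, here is a minimal sketch of the kind of pandas snippet such a prompt might yield, covering filtering, combining, adding a column, and checking for NaNs. The DataFrames, column names, and values are invented for illustration and are not taken from our projects.

```python
import numpy as np
import pandas as pd

# Hypothetical data; names and values are illustrative only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [50.0, np.nan, 20.0, 75.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["East", "West", "East"],
})

# Check for NaNs: count missing values per column.
missing_per_column = orders.isna().sum()

# Combine: left-join the region onto each order.
merged = orders.merge(customers, on="customer_id", how="left")

# Create a new column: flag orders of at least 50.
# Comparisons against NaN evaluate to False, so the missing amount is not flagged.
merged["is_large"] = merged["amount"] >= 50

# Filter: keep East-region orders with a known amount.
east_orders = merged[(merged["region"] == "East") & merged["amount"].notna()]

# Aggregate: total amount per region (NaN amounts are skipped by default).
total_by_region = merged.groupby("region")["amount"].sum()

print(missing_per_column)
print(east_orders)
print(total_by_region)
```

Whatever ChatGPT returns for a prompt like this, it is worth running each step on a small sample first and confirming that missing values are handled the way you intend.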
In summary, we hope that future students will build on top of this blog post and share their own experiences, enabling us all to see how ChatGPT can serve as a tool to give us the intellectual freedom to advance this field even further. Together, we can continue to push the boundaries of what is possible with AI and data science.