11  Chapter 10: Leveraging GenAI for Data Science and Programming in R

11.0.1 Key Topics

  • Introduction to GenAI Tools:
    • Overview of Generative AI tools for data science: GitHub Copilot, ChatGPT, and R-based AI libraries.
    • Using AI for code generation, debugging, and workflow automation.
  • Using GenAI for Coding Efficiency:
    • Demonstration of how GitHub Copilot can help write R code, generate data analysis pipelines, and suggest improvements.
    • Practical example: Use Copilot to build a Shiny app, automate API calls, or visualize data more efficiently.

11.0.2 Outcome

Participants will understand how to leverage GenAI tools to enhance their coding efficiency and explore new possibilities in data science.

11.1 Comparison of GitHub Copilot and ChatGPT

| Feature          | GitHub Copilot                              | ChatGPT                                    |
|------------------|---------------------------------------------|--------------------------------------------|
| Primary Function | AI-powered code completion and suggestion   | Conversational AI with coding capabilities |
| Integration      | Directly integrates with IDEs like VS Code  | Accessible via web interface and API       |
| Strengths        | Context-aware code suggestions in real-time | Versatile across different domains         |
| Best For         | Daily coding tasks within IDEs              | Experimentation and ideation               |
| Weaknesses       | Limited to coding environments              | Requires more interpretation for code use  |

GitHub Copilot excels in providing real-time, context-aware suggestions directly within an IDE, making it ideal for developers focused on software development tasks[1][2]. ChatGPT offers broader conversational capabilities, suitable for exploring new ideas or gaining detailed explanations[3][5].

11.2 Obtaining Free Accounts

11.2.1 GitHub Copilot

  1. Sign Up for GitHub:
  2. Access Copilot:
    • Visit the GitHub Copilot page to start a free trial or use it as part of certain GitHub plans.

11.2.2 ChatGPT

  1. OpenAI Account:
  2. Access ChatGPT:
    • Use ChatGPT through the OpenAI website or integrate it via the API.

11.3 Integrating GenAI Tools into RStudio

11.3.1 GitHub Copilot Integration

  1. Install Visual Studio Code (VS Code):
  2. Install GitHub Copilot Extension:
    • In VS Code, go to Extensions and search for “GitHub Copilot” to install it.
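  3. Use with R Code:
    • Open an R script in VS Code and start typing; Copilot will suggest code completions.
  4. Alternative for RStudio Users:
    • Recent RStudio releases include built-in Copilot support, enabled under Tools > Global Options > Copilot, so VS Code is not required[6].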
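
Copilot's completions tend to be more useful when the script already states the intent. The sketch below shows the kind of descriptive comment and function signature one might type as a prompt, followed by one plausible completion. The function name `summarise_column` and its body are illustrative assumptions, not actual Copilot output, and real suggestions will vary.

```r
# Summarise a numeric column of a data frame:
# return the mean, median, and standard deviation, ignoring missing values.
summarise_column <- function(df, column) {
  values <- df[[column]]
  stopifnot(is.numeric(values))  # reject non-numeric columns early
  c(
    mean   = mean(values, na.rm = TRUE),
    median = median(values, na.rm = TRUE),
    sd     = sd(values, na.rm = TRUE)
  )
}

# Try it on a built-in dataset
summarise_column(mtcars, "mpg")
```

Whatever the assistant proposes, it is worth reading the suggestion and running it on a small, familiar dataset before accepting it.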

11.3.2 Using ChatGPT with RStudio

  1. API Access:
    • Obtain API keys from OpenAI after setting up your account.
  2. Integrate via R Packages:
    • Use an HTTP client package such as httr or httr2 to send requests to the OpenAI API from RStudio. (plumber is for serving APIs from R, not for calling them.)
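library(httr)

# Example request to the OpenAI Chat Completions API.
# The older engines/davinci-codex endpoint has been retired. The model name
# below is an assumption; substitute any chat model your account can access.
response <- POST(
  url = "https://api.openai.com/v1/chat/completions",
  add_headers(Authorization = paste("Bearer", "your_api_key")),
  body = list(
    model = "gpt-4o-mini",
    messages = list(list(role = "user", content = "Write an R function to calculate mean")),
    max_tokens = 100
  ),
  encode = "json"
)

# Extract the generated text from the parsed JSON response
content(response)$choices[[1]]$message$content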
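
API keys are best kept out of scripts, since a key pasted into code is easily leaked through version control or shared notebooks. The following is a minimal sketch of one common approach, assuming the key is stored in an environment variable (for example via a line in `~/.Renviron`). The variable name `OPENAI_API_KEY` and the helper names `get_api_key` and `auth_header_value` are assumptions chosen for illustration.

```r
# Read the API key from an environment variable and fail loudly if it is missing.
# The variable name is an assumed convention, not something the API requires.
get_api_key <- function(var = "OPENAI_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  key
}

# Build the value for the Authorization header
auth_header_value <- function(key) {
  paste("Bearer", key)
}

# Usage inside a request (not run here):
# add_headers(Authorization = auth_header_value(get_api_key()))
```

Failing early with a clear message is usually easier to diagnose than an authentication error returned by the server.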

11.4 Using AI for Code Generation, Debugging, and Workflow Automation

11.4.1 Example: Using GitHub Copilot for Shiny App Development

  1. Building a Shiny App:
    • Start by writing comments describing the app’s functionality.
    • Let Copilot suggest code snippets based on these comments.
# Create a simple Shiny app with a slider input
library(shiny)

# Define UI
ui <- fluidPage(
  titlePanel("Simple Shiny App"),
  sidebarLayout(
    sidebarPanel(
      sliderInput("obs", "Number of observations:", min = 1, max = 1000, value = 500)
    ),
    mainPanel(
      plotOutput("distPlot")
    )
  )
)

# Define server logic
server <- function(input, output) {
  output$distPlot <- renderPlot({
    hist(rnorm(input$obs))
  })
}

# Run the application 
shinyApp(ui = ui, server = server)
  2. Automating API Calls:
    • Use comments to describe the API interaction.
    • Allow Copilot to generate the necessary code structure.
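# Fetch current weather data from an API
library(httr)

# Pass parameters via `query` so httr handles URL encoding
response <- GET(
  "https://api.weatherapi.com/v1/current.json",
  query = list(key = "your_api_key", q = "London")
)
stop_for_status(response)  # stop early on HTTP errors such as an invalid key
weather_data <- content(response)

print(weather_data)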
  3. Visualizing Data Efficiently:
    • Describe the desired visualization.
    • Use Copilot’s suggestions to quickly generate plots.
# Plotting using ggplot2
library(ggplot2)

# Create a scatter plot
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Scatter plot of Weight vs MPG")
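
The examples above focus on code generation. For debugging, a useful habit is to test a suggestion against an edge case before relying on it. The sketch below is hypothetical: `mean_v1` stands in for a first suggestion from an assistant, and `mean_v2` for a revision made after a quick check exposed a problem with missing values. Neither is actual tool output.

```r
# Hypothetical first suggestion: silently returns NA when values are missing.
mean_v1 <- function(x) sum(x) / length(x)

# Revised version after the check below exposed the problem.
mean_v2 <- function(x) {
  stopifnot(is.numeric(x))
  x <- x[!is.na(x)]                      # drop missing values
  if (length(x) == 0) return(NA_real_)   # nothing left to average
  sum(x) / length(x)
}

values <- c(2, 4, NA, 6)

# The first version propagates the missing value
stopifnot(is.na(mean_v1(values)))

# The revised version ignores it and agrees with base R on clean input
stopifnot(mean_v2(values) == 4)
stopifnot(mean_v2(c(1, 2, 3)) == mean(c(1, 2, 3)))
```

Keeping such checks in the script also gives an assistant concrete context when asking it to explain or fix a failure.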

11.5 References

By following these steps and utilizing these tools, participants will enhance their programming efficiency and explore innovative solutions using AI-assisted technologies.

11.5.1 Recap

  • Comparison Table: Highlights key differences between GitHub Copilot and ChatGPT based on functionality, integration, strengths, and weaknesses.
  • Account Setup Instructions: Provides steps to obtain free accounts for both tools.
  • Integration Guide: Offers practical advice on integrating these AI tools into coding environments like RStudio.
  • Examples: Demonstrates practical applications of AI in coding tasks such as building Shiny apps and automating workflows.
  • References: Lists resources for further exploration of GenAI tools.

This chapter equips participants with the knowledge to effectively incorporate AI tools into their data science workflows.

Sources

[1] How to Use Github Copilot in RStudio in order to write code better …. https://www.youtube.com/watch?v=u3g9hNvK314
[2] GitHub Copilot in Rstudio, it’s finally here! - YouTube. https://www.youtube.com/watch?v=yVq-b5xHmac
[3] GitHub Copilot in RStudio and VS Code - Tilburg Science Hub. https://tilburgsciencehub.com/topics/automation/ai/gpt-models/github-copilot/
[4] GitHub Copilot for R - First impressions - YouTube. https://www.youtube.com/watch?v=NGM7Z1Dd9fE
[5] Comparing GitHub Copilot and ChatGPT: A Developer’s Perspective · community · Discussion #64644. https://github.com/orgs/community/discussions/64644
[6] RStudio User Guide - GitHub Copilot - Posit Docs. https://docs.posit.co/ide/user/ide/guide/tools/copilot.html
[7] How to use GitHub Copilot in RStudio - Tilburg.ai. https://tilburg.ai/2023/12/github-copilot-for-r/
[8] GitHub Copilot overview - Visual Studio Code. https://code.visualstudio.com/docs/copilot/overview