2  Your First Reproducible Document with Quarto

Note: Learning Objectives

By the end of this chapter, you will be able to:

  • Explain what Quarto is and how it differs from R Markdown
  • Describe the anatomy of a .qmd file
  • Use YAML headers to control document metadata and output options
  • Control code chunk behaviour using chunk options
  • Render a document to HTML, PDF, and Word
  • Create a simple reproducible report from a real dataset

2.1 Why Quarto?

For many years, R users relied on R Markdown (.Rmd) to write reproducible documents. Quarto is its successor — created by Posit (formerly RStudio) to be language-agnostic (it works with R, Python, Julia, and Observable JS) and to consolidate years of lessons learned from R Markdown.

The key conceptual shift is this: your document and your analysis are the same file. You do not write your analysis in R, copy the output, and paste it into a Word document. You write prose and code together in one .qmd file, and Quarto renders the whole thing into a polished output.

Tip: Quarto vs. R Markdown

If you are coming from R Markdown, here are the key differences:

| Feature | R Markdown | Quarto |
|---------|------------|--------|
| File extension | .Rmd | .qmd |
| Chunk options | In {} header | YAML-style #\| comments |
| Language support | Primarily R | R, Python, Julia, Observable |
| Publishing | Various packages | Built-in (quarto publish) |
| Books/websites | bookdown, distill | Built-in project types |

Quarto can also render most existing .Rmd files without changes.

2.2 Anatomy of a .qmd File

Every Quarto document has three components: a YAML header, Markdown prose, and code chunks.

2.2.1 The YAML Header

The document begins with a YAML header delimited by triple dashes (---). It controls the document’s metadata and output format.

---
title: "My First Report"
author: "Pawan Kumar"
date: last-modified
format:
  html:
    toc: true
    code-fold: true
  pdf:
    toc: true
---

YAML is whitespace-sensitive. Indentation matters. Common YAML options include:

Table 2.1: Common Quarto YAML options

| Option | Purpose |
|--------|---------|
| title | Document title |
| author | Author name(s) |
| date | Publication date (use last-modified to auto-update) |
| format | Output format (html, pdf, docx) |
| execute | Default code execution options |
| bibliography | Path to .bib file for citations |
| toc | Include table of contents |
| number-sections | Number section headings |

2.2.2 Markdown Prose

Everything outside of code chunks is Markdown — a lightweight markup language that uses plain-text symbols to indicate formatting.

# Heading 1
## Heading 2
### Heading 3

**Bold text**
*Italic text*
`inline code`

- Bullet point 1
- Bullet point 2
  - Nested bullet

1. Numbered item
2. Numbered item

[Link text](https://example.com)

> This is a blockquote.

| Col 1 | Col 2 |
|-------|-------|
| A     | 1     |
| B     | 2     |

2.2.3 Code Chunks

Code chunks are enclosed in triple backticks with {r} to specify the language:

```{r}
#| label: my-first-chunk
#| echo: true
#| eval: true

mean(c(10, 20, 30, 40))
```

The #| prefix marks chunk options — YAML-style directives that control how the chunk behaves.

2.3 Code Chunk Options

Chunk options give you fine-grained control over what each chunk contributes to the rendered document. They are written as #| comment lines, in YAML syntax, at the top of each chunk:

Table 2.2: Most important Quarto chunk options

| Option | Type | Default | Effect |
|--------|------|---------|--------|
| label | string | none | Unique chunk identifier (required for cross-references) |
| echo | logical | true | Show the code in the output? |
| eval | logical | true | Run the code? |
| include | logical | true | Include chunk (code + output) in document? |
| warning | logical | true | Show R warnings? |
| message | logical | true | Show R messages? |
| output | logical | true | Show printed output (text)? |
| fig-cap | string | none | Figure caption |
| fig-width | number | 7 | Figure width in inches |
| fig-height | number | 5 | Figure height in inches |
| cache | logical | false | Cache chunk results to disk |
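
To make the warning and message options concrete, here is a small sketch of a chunk body that produces both kinds of condition. The function noisy_mean() is a hypothetical helper invented for this illustration; it is not part of any package or of this chapter's source.

```r
# Hypothetical helper (not from the chapter): emits a message and,
# for non-numeric input, triggers a coercion warning.
noisy_mean <- function(x) {
  # This line is what `message: false` would hide in the rendered document.
  message("Computing the mean of ", length(x), " values")
  # "oops" cannot be converted, so it becomes NA with a warning;
  # that warning is what `warning: false` would hide.
  x_num <- as.numeric(x)
  mean(x_num, na.rm = TRUE)
}

result <- noisy_mean(c("10", "20", "oops"))
result
```

With warning: false and message: false set, only the printed value would appear in the output; the code itself still runs in full.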

2.3.1 Demonstration: Controlling Output

Here is the same code with different chunk options to show the effect:

Show code and output (echo: true, output: true):

summary(mtcars$mpg)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#>   10.40   15.43   19.20   20.09   22.80   33.90

Show output only (echo: false):

#> This code was hidden, but it ran. Mean MPG: 20.09

Show code only, do not run (eval: false):

# This code block will not be executed
very_slow_function_that_takes_hours()

Run, but show neither code nor output (include: false):

The variable x was set to 42 in the hidden chunk above — we can use it inline!
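
The rendered output above hides the source of two of these chunks. A plausible reconstruction of their bodies is sketched below; it is an assumption made for illustration, not the chapter's actual source. In a .qmd file the first part would sit in a chunk with #| echo: false and the second in a chunk with #| include: false.

```r
# Assumed body of the echo: false chunk:
# the code runs, but only its output is shown.
mean_mpg <- round(mean(mtcars$mpg), 2)
cat("This code was hidden, but it ran. Mean MPG:", mean_mpg, "\n")

# Assumed body of the include: false chunk:
# it runs silently, and x stays available to later chunks and inline code.
x <- 42
```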

2.3.2 Global Options with execute

Rather than setting options in every chunk, set defaults in the YAML header:

execute:
  echo: true
  warning: false
  message: false
  cache: false

Individual chunks can then override these defaults as needed.

2.4 Inline Code

Inline code embeds the result of an R expression directly in your prose. Write the expression inside backticks, prefixed with r:

The dataset contains `r nrow(mtcars)` rows and 
`r ncol(mtcars)` columns. The mean fuel efficiency 
is `r round(mean(mtcars$mpg), 1)` miles per gallon.

This renders as:

The dataset contains 32 rows and 11 columns. The mean fuel efficiency is 20.1 miles per gallon.

Important: Never Hard-Code Numbers in Your Reports

If your dataset changes, inline code updates automatically. Hard-coded numbers go stale and introduce errors. Always compute values from code.
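
One way to keep reported numbers tied to the data is to compute them once in a chunk and refer to the stored objects from inline code. The sketch below uses object names (n_rows, n_cols, mean_mpg) chosen for this illustration, and assembles the sentence with sprintf() only so that the result can be inspected at the console.

```r
# Compute every reported value from the data itself.
n_rows   <- nrow(mtcars)
n_cols   <- ncol(mtcars)
mean_mpg <- round(mean(mtcars$mpg), 1)

# In a .qmd file these objects would be referenced with inline code;
# here the same sentence is built as a string for inspection.
sentence <- sprintf(
  "The dataset contains %d rows and %d columns. The mean fuel efficiency is %.1f miles per gallon.",
  n_rows, n_cols, mean_mpg
)
print(sentence)
```

If mtcars were swapped for a different data frame, the three values and the sentence would change with it, with no manual editing.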

2.5 Rendering Documents

2.5.1 From RStudio

Click the Render button (or press Ctrl+Shift+K; Cmd+Shift+K on macOS). RStudio renders to the first format listed in the YAML header.

2.5.2 From the Terminal

# Render to the format(s) declared in the YAML header
quarto render document.qmd

# Render to a specific format
quarto render document.qmd --to html
quarto render document.qmd --to pdf
quarto render document.qmd --to docx

# Render all files in a project
quarto render
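
The same renders can also be driven from an R session. The sketch below assumes the quarto R package is installed (it wraps the command-line tool) and defines a hypothetical helper, render_report(), which skips rendering when the input file or the package is missing.

```r
# Hypothetical helper: render one .qmd file to several formats from R.
# Assumes the 'quarto' R package and the Quarto CLI are both installed.
render_report <- function(input, formats = c("html", "pdf", "docx")) {
  if (!file.exists(input)) {
    message("Input file not found: ", input)
    return(invisible(FALSE))
  }
  if (!requireNamespace("quarto", quietly = TRUE)) {
    message("The 'quarto' package is not installed.")
    return(invisible(FALSE))
  }
  for (fmt in formats) {
    # Equivalent to: quarto render <input> --to <fmt>
    quarto::quarto_render(input, output_format = fmt)
  }
  invisible(TRUE)
}

# Returns FALSE (with a message) if document.qmd is not in the working directory.
rendered <- render_report("document.qmd", formats = "html")
```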

2.5.3 Multiple Output Formats

To produce several formats from one document, list them all under format in the YAML. Running quarto render without --to then renders each of them:

format:
  html:
    toc: true
  pdf:
    toc: true
  docx:
    reference-doc: template.docx

2.6 Cross-References

Quarto has a built-in cross-reference system. Label a figure or table, then reference it from anywhere in the document.

Figures:

```{r}
#| label: fig-mpg-hist
#| fig-cap: "Distribution of fuel efficiency in the mtcars dataset"

hist(mtcars$mpg, main = "", xlab = "Miles per Gallon")
```

Then reference it as: As shown in @fig-mpg-hist...

The rendered result, here with a fill colour and white bar borders added:

hist(mtcars$mpg,
     main = "",
     xlab = "Miles per Gallon",
     col  = "#3498db",
     border = "white")

[Figure: histogram showing the distribution of MPG values]
Figure 2.1: Distribution of fuel efficiency in the mtcars dataset.

As shown in Figure 2.1, fuel efficiency follows an approximately normal distribution with a slight right skew.
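
The visual impression of a slight right skew can be checked numerically. The sketch below compares the mean with the median and computes a simple moment-based skewness coefficient. The variable names are illustrative, and this is a quick diagnostic, not a formal test of normality.

```r
mpg <- mtcars$mpg

# In a right-skewed distribution the mean is pulled above the median.
mean_mpg   <- mean(mpg)
median_mpg <- median(mpg)

# Moment-based skewness: positive values indicate a longer right tail.
skewness <- mean((mpg - mean_mpg)^3) / (mean((mpg - mean_mpg)^2))^1.5

c(mean = mean_mpg, median = median_mpg, skewness = skewness)
```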

Tables:

```{r}
#| label: tbl-mtcars-summary
#| tbl-cap: "Summary statistics for mtcars"

knitr::kable(summary(mtcars[, 1:4]))
```

Reference as: @tbl-mtcars-summary shows...
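
Note that summary() applied to a data frame returns a formatted character table, not numeric columns. If a plainer layout is preferred, one alternative (a sketch, not the chapter's method) is to assemble the statistics into an ordinary data frame first. The name summary_tbl is chosen for this illustration.

```r
vars <- mtcars[, 1:4]

# One row per variable, one numeric column per statistic.
summary_tbl <- data.frame(
  variable  = names(vars),
  mean      = sapply(vars, mean),
  sd        = sapply(vars, sd),
  min       = sapply(vars, min),
  max       = sapply(vars, max),
  row.names = NULL
)

# Inside a chunk labelled tbl-mtcars-summary this could be passed to
# knitr::kable(summary_tbl, digits = 2); it is printed directly here.
print(summary_tbl)
```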

2.7 Callout Blocks

Callout blocks draw attention to important information:

::: {.callout-note}
## Note Title
This is a note.
:::

::: {.callout-tip}
## Tip Title
This is a tip.
:::

::: {.callout-warning}
## Warning Title
This is a warning.
:::

::: {.callout-important}
## Important Title
This is important.
:::

::: {.callout-caution}
## Caution Title
Proceed with caution.
:::

2.8 Hands-On Exercise: Your First Report

Let us create a complete mini-report using the mtcars dataset. This exercise brings together everything in this chapter.

Create a new file my-first-report.qmd and add the following:

---
title: "Motor Trend Cars: A Brief Analysis"
author: "Your Name"
date: last-modified
format:
  html:
    toc: true
    code-fold: true
    theme: flatly
execute:
  warning: false
  message: false
---

Then add these sections with code:

# Load the data
data(mtcars)

# Add row names as a column
mtcars_df <- mtcars |>
  tibble::rownames_to_column("car") |>
  tibble::as_tibble()

# How large is the dataset?
n_cars <- nrow(mtcars_df)
n_vars <- ncol(mtcars_df)

The mtcars dataset contains data on 32 cars and 12 variables, extracted from the 1974 Motor Trend magazine.

knitr::kable(head(mtcars_df[, 1:7]))

Table 2.3: First six rows of the mtcars dataset

| car | mpg | cyl | disp | hp | drat | wt |
|-----|-----|-----|------|----|------|----|
| Mazda RX4 | 21.0 | 6 | 160 | 110 | 3.90 | 2.620 |
| Mazda RX4 Wag | 21.0 | 6 | 160 | 110 | 3.90 | 2.875 |
| Datsun 710 | 22.8 | 4 | 108 | 93 | 3.85 | 2.320 |
| Hornet 4 Drive | 21.4 | 6 | 258 | 110 | 3.08 | 3.215 |
| Hornet Sportabout | 18.7 | 8 | 360 | 175 | 3.15 | 3.440 |
| Valiant | 18.1 | 6 | 225 | 105 | 2.76 | 3.460 |

library(ggplot2)

ggplot(mtcars_df, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.2, alpha = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    x     = "Number of Cylinders",
    y     = "Miles per Gallon",
    fill  = "Cylinders",
    title = "Fuel Efficiency by Number of Cylinders"
  ) +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none")
Figure 2.2: Fuel efficiency varies substantially by number of cylinders.

Figure 2.2 shows that cars with 4 cylinders achieve substantially better fuel efficiency than 6- or 8-cylinder cars.
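
The pattern in the boxplot can be backed with numbers. A minimal sketch using base R's aggregate() on the original mtcars data frame (mpg_by_cyl is an illustrative name):

```r
# Mean fuel efficiency for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Keep the rows ordered by number of cylinders.
mpg_by_cyl <- mpg_by_cyl[order(mpg_by_cyl$cyl), ]

print(mpg_by_cyl)
```

Reporting these means through inline code, instead of typing them, keeps the sentence in step with the data.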

Note: Try It Yourself

Extend this report by adding a scatter plot of mpg vs wt (weight). Add a linear regression line using geom_smooth(method = "lm"). What does the relationship tell you?

2.9 Publishing Your Document

Once you have a document you are proud of, Quarto makes publishing straightforward:

# Publish to Quarto Pub (free hosting)
quarto publish quarto-pub document.qmd

# Publish to GitHub Pages
quarto publish gh-pages

# Publish to Netlify
quarto publish netlify

For RStudio users: File → Publish Document provides a GUI for publishing to RPubs or Connect.

2.10 Exercises

  1. Create a new .qmd file. Write a YAML header that renders to both HTML and PDF. Set warning: false and message: false globally.

  2. Load the airquality dataset. Create a report with three sections: (a) an overview of the data using str() and inline code reporting the number of rows; (b) a histogram of ozone levels; (c) a cross-reference to that histogram in your prose.

  3. Demonstrate the difference between echo: false and include: false in the same document. Explain what you see in the output.

  4. Use a callout block to highlight one important finding from your airquality analysis.

  5. Challenge: Parameterise your report so that the variable being analysed (Ozone, Wind, or Temp in airquality) can be changed by modifying a single value in the YAML header. (Hint: look up Quarto parameters with params:.)