# Cleaning Headers

This article explains how to clean and simplify headers and labels during import so that variables created from PDFs, Excel or CSV have clear, usable names.

You will learn how to:

* Deal with multi line and messy headers.
* Remove repeated header rows.
* Strip out footnotes from labels.
* Make row labels meaningful for variables and reports.

***

## Why header cleaning matters

Imported headers often come directly from external reports and may contain:

* Long descriptive text that is not helpful as a variable name.
* Footnote markers, references or symbols.
* Repeated boilerplate such as `For the year ended`.
* Inconsistent wording across similar items.

If you use these raw labels for variables, your model becomes harder to search and understand.

Cleaning headers at import time gives you:

* Short, consistent names for variables and categories.
* Cleaner reports and dashboards.
* Easier reuse of imported variables across branches and scenarios.

***

## Common header issues

Typical issues include:

* Multi line headers where Excel or PDF wraps text into several lines in the same cell.
* Header rows inside the table repeated on each page of a report.
* Footnotes in labels such as `Revenue*` with `* see note 3`.
* Overly long labels copied from disclosure language.

All of these can be addressed during the mapping and cleaning steps.

***

## Simplifying row labels

For each row label that will become a variable or category name, aim for:

* A short, descriptive phrase.
* Consistent wording across similar items.
* No footnote markers or extraneous punctuation.

Examples:

* `Revenue from contracts with customers` → `Revenue`
* `Cost of sales` → `COGS`
* `Selling and distribution expenses` → `Opex - Selling`
* `Property, plant and equipment` → `Assets - Plant`

You can keep the original wording in a note if it is needed for audit purposes.

***

## Removing repeated header rows

When importing multi page tables from PDFs, repeated header rows can be pulled in as data rows. You should:

* Identify rows that are clearly headers (for example they repeat the column headings on each page).
* Mark them as non-data rows or delete them in the cleaning step.
* Ensure that only genuine data rows remain to be mapped to variables.

This prevents duplicate or meaningless variables being created.

***

## Handling footnotes and annotations

Footnote markers such as `*`, `†`, or `(1)` often appear in row labels or column headings. You should:

* Remove footnote markers from the label text.
* If the footnote content is important, copy it into a note on the variable after import.
* Avoid encoding footnote detail directly into the variable name.

This keeps variable names clean while preserving context where needed.

***

## Column headers and period labels

Column headers that represent periods (for example `2023`, `2024`) are handled by the date detection logic rather than header cleaning. You usually only need to clean:

* Column labels for non-date columns, such as `Actual`, `Budget`, `Variance`.
* Columns that will be imported as drivers or metadata.

Give these columns short, descriptive names if you plan to use them as drivers or filters in the model.

***

## Practical tips

{% hint style="info" %}

* Spend a few minutes cleaning headers on import rather than living with noisy names for the life of the model.
* Use a consistent naming pattern where you have similar lines across multiple imports.
* Keep a separate reference to the original labels if your workflow requires auditability.
  {% endhint %}

***

## Related articles

* [Campus/Program Consolidation](/use-cases/education-providers-private-vocational/campus-program-consolidation.md)
* [Build Scenario Aware Data Overrides](/how-tos/data-workflows-and-automation/build-scenario-aware-data-overrides.md)
* [Time & Periodicity Basics](/help/getting-started/time-and-periodicity-basics.md)
* [Units & Bounds Panel](/syntax/understanding-the-interface/units-and-bounds-panel.md)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://help.modelreef.io/help/importing-and-data-inputs/cleaning-headers.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
