Series Collection¶
Users of the FRED API often want to analyze multiple economic series. This can be done with FredSeries alone, but it is tedious and cumbersome. pyfredapi offers the SeriesCollection class to streamline the process of collecting and munging data for plotting and analysis.
A SeriesCollection object is a set of SeriesData objects. SeriesCollection provides helpful methods to:
- List the metadata (frequency, seasonality, units, etc.) of the series in the collection
- Merge series dataframes into a long dataframe
- Merge series dataframes into a wide dataframe by index
- Merge series dataframes into a wide dataframe by date
Setup¶
Import pyfredapi
from rich.pretty import pprint
import pyfredapi as pf
Create a SeriesCollection¶
Create an instance of SeriesCollection, passing the ids of the series to collect with series_id. By default the column for the series values will be renamed to the series id.
sc = pf.SeriesCollection(series_id=["GDP"])
Requesting series GDP...
Collect additional series¶
Add more series to a SeriesCollection object with add().
sc.add(series_id=["SP500"])
Requesting series SP500...
Remove series¶
Remove series from the collection with remove().
sc.remove("SP500")
Removed series SP500
fig = sc.GDP.plot()
# choose the renderer appropriate for your environment
fig.show(renderer="sphinx_gallery")
Accessing the data¶
The SeriesCollection is composed of SeriesData objects. You can access the SeriesData by attribute. Each series_id added to the collection will be an attribute that returns the SeriesData object for that series.
SeriesData has two attributes:
- info: The series metadata.
- df: Series observations in a pandas dataframe.
Access via attribute¶
sc.GDP == sc["GDP"]
True
pprint(sc.GDP.info)
SeriesInfo(
    id='GDP',
    realtime_start='2024-10-30',
    realtime_end='2024-10-30',
    title='Gross Domestic Product',
    observation_start='1947-01-01',
    observation_end='2024-07-01',
    frequency='Quarterly',
    frequency_short='Q',
    units='Billions of Dollars',
    units_short='Bil. of $',
    seasonal_adjustment='Seasonally Adjusted Annual Rate',
    seasonal_adjustment_short='SAAR',
    last_updated='2024-10-30 07:54:01-05',
    popularity=93,
    notes='BEA Account Code: A191RC\n\nGross domestic product (GDP), the featured measure of U.S. output, is the market value of the goods and services produced by labor and property located in the United States.For more information, see the Guide to the National Income and Product Accounts of the United States (NIPA) and the Bureau of Economic Analysis (http://www.bea.gov/national/pdf/nipaguid.pdf).'
)
sc.GDP.df.tail()
| | date | GDP |
|---|---|---|
| 310 | 2023-07-01 | 27967.697 |
| 311 | 2023-10-01 | 28296.967 |
| 312 | 2024-01-01 | 28624.069 |
| 313 | 2024-04-01 | 29016.714 |
| 314 | 2024-07-01 | 29349.924 |
Access via bracket notation¶
pprint(sc["GDP"].info)
SeriesInfo(
    id='GDP',
    realtime_start='2024-10-30',
    realtime_end='2024-10-30',
    title='Gross Domestic Product',
    observation_start='1947-01-01',
    observation_end='2024-07-01',
    frequency='Quarterly',
    frequency_short='Q',
    units='Billions of Dollars',
    units_short='Bil. of $',
    seasonal_adjustment='Seasonally Adjusted Annual Rate',
    seasonal_adjustment_short='SAAR',
    last_updated='2024-10-30 07:54:01-05',
    popularity=93,
    notes='BEA Account Code: A191RC\n\nGross domestic product (GDP), the featured measure of U.S. output, is the market value of the goods and services produced by labor and property located in the United States.For more information, see the Guide to the National Income and Product Accounts of the United States (NIPA) and the Bureau of Economic Analysis (http://www.bea.gov/national/pdf/nipaguid.pdf).'
)
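Both access styles above return the same object. As a rough illustration of how a container can support that, here is a minimal sketch using hypothetical `MiniSeries` and `MiniCollection` classes. They are stand-ins invented for this example, not pyfredapi's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MiniSeries:
    """Hypothetical stand-in for SeriesData: metadata plus observations."""

    info: dict
    rows: list = field(default_factory=list)


class MiniCollection:
    """Hypothetical container illustrating attribute and bracket access."""

    def __init__(self):
        self._data = {}

    def add(self, series_id, series):
        self._data[series_id] = series

    def __getitem__(self, series_id):
        # Bracket access: collection["GDP"]
        return self._data[series_id]

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so real
        # attributes such as _data are unaffected.
        try:
            return self.__dict__["_data"][name]
        except KeyError:
            raise AttributeError(name) from None


mc = MiniCollection()
mc.add("GDP", MiniSeries(info={"id": "GDP", "frequency": "Quarterly"}))
same_object = mc.GDP is mc["GDP"]  # both routes reach the same stored object
```

Bracket access also works for keys that are not valid Python identifiers, which attribute access cannot express.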
Rename series in the collection¶
Rename on add¶
You can rename the series when adding them to the collection. Renaming can be done with a dictionary mapping the series id to the new name, or with a function which parses the series title into the new name.
# Rename with a dictionary
new_names = {
    "CPIAUCSL": "cpi_all_items",
    "CPILFESL": "cpi_all_items_less_food_and_energy",
}
cpi_sc = pf.SeriesCollection(series_id=["CPIAUCSL", "CPILFESL"], rename=new_names)
Requesting series CPIAUCSL...
Requesting series CPILFESL...
cpi_sc.CPIAUCSL.df.head()
| | date | cpi_all_items |
|---|---|---|
| 0 | 1947-01-01 | 21.48 |
| 1 | 1947-02-01 | 21.62 |
| 2 | 1947-03-01 | 22.00 |
| 3 | 1947-04-01 | 22.00 |
| 4 | 1947-05-01 | 21.95 |
cpi_sc.CPILFESL.df.head()
| | date | cpi_all_items_less_food_and_energy |
|---|---|---|
| 0 | 1957-01-01 | 28.5 |
| 1 | 1957-02-01 | 28.6 |
| 2 | 1957-03-01 | 28.7 |
| 3 | 1957-04-01 | 28.8 |
| 4 | 1957-05-01 | 28.8 |
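The two renaming styles differ in what they are keyed on: a dictionary is looked up by series id, while a function receives the series title. The sketch below shows that rule with a hypothetical helper, `resolve_column_name`. It is not part of pyfredapi, and falling back to the series id when the dictionary has no entry is an assumption made for the example.

```python
from typing import Callable, Mapping, Optional, Union

Rename = Optional[Union[Mapping[str, str], Callable[[str], str]]]


def resolve_column_name(series_id: str, title: str, rename: Rename = None) -> str:
    """Hypothetical helper: pick the value-column name for one series."""
    if rename is None:
        return series_id  # default: column named after the series id
    if callable(rename):
        return rename(title)  # function: derive the name from the title
    # Dictionary: look up by series id (fallback to the id is an assumption).
    return rename.get(series_id, series_id)


title = "Consumer Price Index for All Urban Consumers: All Items in U.S. City Average"
names = {"CPIAUCSL": "cpi_all_items"}

by_default = resolve_column_name("CPIAUCSL", title)
by_dict = resolve_column_name("CPIAUCSL", title, names)
by_func = resolve_column_name("CPIAUCSL", title, lambda t: t.split(":")[-1].strip())
```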
Rename after add¶
You can rename series already in the collection with the rename_series method. It works the same way as renaming on add.
def parse_cpi_title(title: str) -> str:
    """Parse CPI series title into a readable label."""
    return (
        title.lower()
        .replace("consumer price index", "CPI ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .title()
    )
cpi_sc.rename_series(rename=parse_cpi_title)
cpi_sc.CPIAUCSL.df.head()
| | date | Cpi All Items |
|---|---|---|
| 0 | 1947-01-01 | 21.48 |
| 1 | 1947-02-01 | 21.62 |
| 2 | 1947-03-01 | 22.00 |
| 3 | 1947-04-01 | 22.00 |
| 4 | 1947-05-01 | 21.95 |
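To see why the column is labelled "Cpi All Items" rather than "CPI All Items", the parser can be run on the two series titles without any API call. The snippet repeats the function so that it runs on its own; the titles are copied from the list_series() output shown later on this page.

```python
def parse_cpi_title(title: str) -> str:
    """Same parser as above, repeated so this snippet is self-contained."""
    return (
        title.lower()
        .replace("consumer price index", "CPI ")
        .replace(" for all urban consumers: ", "")
        .replace(" in u.s. city average", "")
        .title()  # title-casing runs last, so "CPI" becomes "Cpi"
    )


titles = {
    "CPIAUCSL": "Consumer Price Index for All Urban Consumers: "
    "All Items in U.S. City Average",
    "CPILFESL": "Consumer Price Index for All Urban Consumers: "
    "All Items Less Food and Energy in U.S. City Average",
}

labels = {series_id: parse_cpi_title(t) for series_id, t in titles.items()}
for series_id, label in labels.items():
    print(f"{series_id} -> {label}")
```

Because `.title()` is the last step in the chain, it also rewrites the inserted acronym. A parser that should keep "CPI" in capitals would need to insert it after title-casing.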
List metadata¶
SeriesCollection has a number of list methods to print out the metadata of the series in the collection.
Series in the collection¶
cpi_sc.list_series()
CPIAUCSL: Consumer Price Index for All Urban Consumers: All Items in U.S. City Average
CPILFESL: Consumer Price Index for All Urban Consumers: All Items Less Food and Energy in U.S. City Average
Frequency¶
cpi_sc.list_frequency()
All series are Monthly
Seasonality¶
cpi_sc.list_seasonality()
All series are Seasonally Adjusted
Units¶
cpi_sc.list_units()
All series are that are measured in Index 1982-1984=100
Dates¶
cpi_sc.list_end_date()
All series end on 2024-09-01
cpi_sc.list_start_date()
Series that start on 1957-01-01
CPILFESL: Consumer Price Index for All Urban Consumers: All Items Less Food and Energy in U.S. City Average
Series that start on 1947-01-01
CPIAUCSL: Consumer Price Index for All Urban Consumers: All Items in U.S. City Average
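The list methods above print a single line when every series shares a value and group the series otherwise. A minimal sketch of that grouping logic follows. The `summarize` helper and its simplified output format are hypothetical, written only to illustrate the idea.

```python
from collections import defaultdict


def summarize(values_by_id: dict, label: str) -> list:
    """Hypothetical helper: report one shared value or group ids by value."""
    groups = defaultdict(list)
    for series_id, value in values_by_id.items():
        groups[value].append(series_id)

    if len(groups) == 1:
        (only_value,) = groups  # unpack the single key
        return [f"All series {label} {only_value}"]

    return [
        f"Series that {label} {value}: {', '.join(ids)}"
        for value, ids in groups.items()
    ]


end_lines = summarize(
    {"CPIAUCSL": "2024-09-01", "CPILFESL": "2024-09-01"}, "end on"
)
start_lines = summarize(
    {"CPIAUCSL": "1947-01-01", "CPILFESL": "1957-01-01"}, "start on"
)
```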
Merge data¶
SeriesCollection supports merging the data into long and wide formats. By default the series ID will be used as the column name or observation label.
Merge long¶
Merge the series in the collection into a long pandas dataframe.
cpi_long = cpi_sc.merge_long()
cpi_long
| | date | value | series |
|---|---|---|---|
| 0 | 1947-01-01 | 21.480 | Cpi All Items |
| 1 | 1947-02-01 | 21.620 | Cpi All Items |
| 2 | 1947-03-01 | 22.000 | Cpi All Items |
| 3 | 1947-04-01 | 22.000 | Cpi All Items |
| 4 | 1947-05-01 | 21.950 | Cpi All Items |
| ... | ... | ... | ... |
| 1741 | 2024-05-01 | 318.140 | Cpi All Items Less Food And Energy |
| 1742 | 2024-06-01 | 318.346 | Cpi All Items Less Food And Energy |
| 1743 | 2024-07-01 | 318.872 | Cpi All Items Less Food And Energy |
| 1744 | 2024-08-01 | 319.768 | Cpi All Items Less Food And Energy |
| 1745 | 2024-09-01 | 320.767 | Cpi All Items Less Food And Energy |
1746 rows × 3 columns
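In a long dataframe each observation is its own row, tagged with the series it belongs to. The dependency-free sketch below builds the same shape from plain Python data; the values are the first observations from the tables above, and the variable names are illustrative only.

```python
# Toy observations: the first two rows of each series shown earlier.
series = {
    "Cpi All Items": [("1947-01-01", 21.48), ("1947-02-01", 21.62)],
    "Cpi All Items Less Food And Energy": [
        ("1957-01-01", 28.5),
        ("1957-02-01", 28.6),
    ],
}

# Long format: one row per (date, value, series) observation,
# with the series stacked one after another.
long_rows = [
    {"date": date, "value": value, "series": name}
    for name, observations in series.items()
    for date, value in observations
]

for row in long_rows:
    print(row)
```

Stacking means the row count is the sum of the individual series lengths. That is consistent with the 1746 rows above: 933 monthly CPIAUCSL observations (index 0 to 932 in the wide tables) plus 813 monthly CPILFESL observations from 1957-01 through 2024-09.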
Merge as-of¶
Merge the series in the collection into a wide pandas dataframe based on nearest date. You must define a base series; its dates serve as the basis for the join.
cpi_asof = cpi_sc.merge_asof(base_series_id="CPIAUCSL")
cpi_asof.tail()
| | date | Cpi All Items | Cpi All Items Less Food And Energy |
|---|---|---|---|
| 928 | 2024-05-01 | 313.225 | 318.140 |
| 929 | 2024-06-01 | 313.049 | 318.346 |
| 930 | 2024-07-01 | 313.534 | 318.872 |
| 931 | 2024-08-01 | 314.121 | 319.768 |
| 932 | 2024-09-01 | 314.686 | 320.767 |
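An as-of join matters most when the series do not share dates, for example a monthly base series joined with a quarterly one. The sketch below shows the idea with the standard library only. The `asof_join` helper and the toy data are hypothetical, and it assumes backward matching (the latest observation at or before each base date); pyfredapi's actual matching rule may differ.

```python
from bisect import bisect_right
from datetime import date


def asof_join(base, other):
    """Pair each base date with the latest `other` value at or before it.

    Both inputs are lists of (date, value) tuples sorted by date.
    Backward matching is an assumption made for this sketch.
    """
    other_dates = [d for d, _ in other]
    merged = []
    for d, base_value in base:
        i = bisect_right(other_dates, d)  # count of other dates <= d
        other_value = other[i - 1][1] if i else None
        merged.append((d, base_value, other_value))
    return merged


# Hypothetical toy data: a monthly base series and a quarterly series.
monthly = [(date(2024, m, 1), float(m)) for m in range(1, 7)]
quarterly = [(date(2024, 1, 1), 100.0), (date(2024, 4, 1), 200.0)]

rows = asof_join(monthly, quarterly)
for d, base_value, other_value in rows:
    print(d, base_value, other_value)
```

The result has exactly one row per base date, which is why the base series has to be chosen explicitly. In pandas, this kind of join is available as `pandas.merge_asof`, whose `direction` parameter selects backward, forward, or nearest matching.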
Merge wide¶
Merge the series in the collection into a wide pandas dataframe. This only works if all the series in the collection share the same date index.
cpi_wide = cpi_sc.merge_wide()
cpi_wide.tail()
| | date | Cpi All Items | Cpi All Items Less Food And Energy |
|---|---|---|---|
| 928 | 2024-05-01 | 313.225 | 318.140 |
| 929 | 2024-06-01 | 313.049 | 318.346 |
| 930 | 2024-07-01 | 313.534 | 318.872 |
| 931 | 2024-08-01 | 314.121 | 319.768 |
| 932 | 2024-09-01 | 314.686 | 320.767 |
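A wide merge places each series in its own column, aligned on exact dates. The following sketch shows that alignment with plain dictionaries. The `wide_by_date` helper is hypothetical, and raising an error on mismatched dates is an assumption made for this example, not a statement about how pyfredapi behaves.

```python
def wide_by_date(series):
    """Hypothetical helper: join series column-wise on identical dates.

    `series` maps a column name to a {date: value} dict.
    """
    names = list(series)
    dates = sorted(series[names[0]])
    for name in names[1:]:
        if sorted(series[name]) != dates:
            raise ValueError(f"{name} does not share the same dates")
    # One row per date, one column per series.
    return [{"date": d, **{n: series[n][d] for n in names}} for d in dates]


# Values copied from the last two rows of the table above.
wide = wide_by_date(
    {
        "Cpi All Items": {"2024-08-01": 314.121, "2024-09-01": 314.686},
        "Cpi All Items Less Food And Energy": {
            "2024-08-01": 319.768,
            "2024-09-01": 320.767,
        },
    }
)
for row in wide:
    print(row)
```

When the dates do not line up exactly, an as-of merge against a base series is the more forgiving option.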