Merge pull request #39 from Nixtla/docs/azure-vignette
docs: add azure vignette
MMenchero authored Oct 25, 2024
2 parents c2a132f + 9aa3c0e commit 8cd89c3
Showing 19 changed files with 203 additions and 56 deletions.
2 changes: 2 additions & 0 deletions _pkgdown.yml
@@ -19,6 +19,8 @@ navbar:
menu:
- text: "Anomaly Detection"
href: articles/anomaly-detection.html
- text: "Azure Quickstart"
href: articles/azure-quickstart.html
- text: "Cross-Validation"
href: articles/cross-validation.html
- text: "Data requirements"
Binary file added man/figures/azure_deploy.png
Binary file added man/figures/azure_endpoints.png
Binary file added man/figures/azure_landing.png
Binary file added man/figures/azure_models.png
Binary file removed man/figures/diagram.png
Binary file not shown.
Binary file added man/figures/diagram_setup.png
9 changes: 5 additions & 4 deletions vignettes/anomaly-detection.Rmd
@@ -31,7 +31,7 @@ library(nixtlar)
## 1. Anomaly detection
Anomaly detection plays a crucial role in time series analysis and forecasting. Anomalies, also known as outliers, are unusual observations that don't follow the expected time series patterns. They can be caused by a variety of factors, including errors in the data collection process, unexpected events, or sudden changes in the patterns of the time series. Anomalies can provide critical information about a system, like a potential problem or malfunction. After identifying them, it is important to understand what caused them, and then decide whether to remove, replace, or keep them.

`TimeGPT` has a method for detecting anomalies, and users can call it from `nixtlar`. This vignette will explain how to do this. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/anomaly-detection.html) vignette first.
`TimeGPT` has a method for detecting anomalies, and users can call it from `nixtlar`. This vignette will explain how to do this. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/get-started.html) vignette first.

## 2. Load data
For this vignette, we'll use the electricity consumption dataset that is included in `nixtlar`, which contains the hourly prices of five different electricity markets.
@@ -41,10 +41,11 @@ df <- nixtlar::electricity
head(df)
```

## 3. Detect anomalies
To detect anomalies, use `nixtlar::nixtla_client_detect_anomalies`, which should include the following parameter:
## 3. Detect Anomalies

- **df**: The time series data, either as a data frame, a tibble, or a tsibble. It should include at least a column with the timestamps and a column with the observations. Default names for these columns are `ds` and `y`. If different, please specify their names. If working with multiple series, you also need to include a column with unique identifiers. The default name for this column is `unique_id`.
To detect anomalies, use `nixtlar::nixtla_client_detect_anomalies`, which requires the following parameter:

- **df**: The time series data, provided as a data frame, tibble, or tsibble. It must include at least two columns: one for the timestamps and one for the observations. The default names for these columns are `ds` and `y`. If your column names are different, specify them with `time_col` and `target_col`, respectively. If you are working with multiple series, you must also include a column with unique identifiers. The default name for this column is `unique_id`; if different, specify it with `id_col`.
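
Before calling the function, it can help to confirm that the expected columns are present. The following is a minimal base R sketch; `check_required_cols` is a hypothetical helper that is not part of `nixtlar`, and the toy data frame is made up for illustration.

```r
# Hypothetical helper (not part of nixtlar): returns the names of any
# required columns that are missing from the data frame.
check_required_cols <- function(df, id_col = "unique_id", time_col = "ds", target_col = "y") {
  # Passing id_col = NULL drops the identifier requirement (single series)
  required <- c(id_col, time_col, target_col)
  setdiff(required, names(df))
}

# Toy data with non-default column names (illustrative values only)
toy <- data.frame(
  series = c("A", "A", "B", "B"),
  timestamp = c("2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"),
  value = c(1.5, 2.0, 3.1, 2.9)
)

# With the default names, all three columns are reported as missing
missing_default <- check_required_cols(toy)

# With the actual names, nothing is missing
missing_custom <- check_required_cols(
  toy,
  id_col = "series", time_col = "timestamp", target_col = "value"
)
```

If the custom names pass this check, the same `id_col`, `time_col`, and `target_col` values are the ones to hand to `nixtlar::nixtla_client_detect_anomalies`.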

```{r}
nixtla_client_anomalies <- nixtlar::nixtla_client_detect_anomalies(df)
100 changes: 100 additions & 0 deletions vignettes/azure-quickstart.Rmd
@@ -0,0 +1,100 @@
---
title: "TimeGEN-1 Quickstart (Azure)"
output:
rmarkdown::html_vignette:
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Azure Quickstart}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
library(httptest2)
.mockPaths("../tests/mocks")
start_vignette(dir = "../tests/mocks")
original_options <- options("NIXTLA_API_KEY"="dummy_api_key", digits=7)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.width = 7,
fig.height = 4
)
```

```{r}
library(nixtlar)
```

TimeGEN-1 is TimeGPT optimized for Azure, Microsoft's cloud computing service. You can easily access TimeGEN via `nixtlar`. To do this, just follow these steps:

## 1. Set up a TimeGEN-1 endpoint account and generate your API key on Azure.

- Go to [ml.azure.com](https://ml.azure.com)
- Sign in or create an account.
- If you don't have one already, create a workspace. This might require a subscription.

![](../man/figures/azure_landing.png)

- Click on `Models` in the sidebar and select `TimeGEN` in the model catalog.

![](../man/figures/azure_models.png)

- Click `Deploy`. This will create an Endpoint.

![](../man/figures/azure_deploy.png)

- Go to your Endpoint in the sidebar. Here you will find your Base URL and the API key.

![](../man/figures/azure_endpoints.png)

## 2. Install `nixtlar`

In your favorite R IDE, install `nixtlar` from CRAN or GitHub.

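```{r, eval = FALSE}
# Option 1: install the released version from CRAN
install.packages("nixtlar")

# Option 2: install the development version from GitHub
# install.packages("devtools")  # if devtools is not installed yet
devtools::install_github("Nixtla/nixtlar")
```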

## 3. Set up the Base URL and API key

To do this, use the `nixtla_client_setup` function.

```{r, eval = FALSE}
nixtla_client_setup(
base_url = "Base URL here",
api_key = "API key here"
)
```

## 4. Start making forecasts!

Now you can start making forecasts! We will use the electricity dataset that is included in `nixtlar`. This dataset contains the prices of different electricity markets.

```{r}
df <- nixtlar::electricity
nixtla_client_fcst <- nixtla_client_forecast(df, h = 8, level = c(80,95))
head(nixtla_client_fcst)
```

We can plot the forecasts with the `nixtla_client_plot` function.

```{r}
nixtla_client_plot(df, nixtla_client_fcst, max_insample_length = 200)
```

To learn more about data requirements and TimeGPT's capabilities, please read the nixtlar vignettes.

## Discover the power of TimeGEN on Azure via `nixtlar`.

Deploying TimeGEN via `nixtlar` on Azure allows you to implement robust and scalable forecasting solutions. This not only simplifies the integration of advanced analytics into your workflows but also ensures that you have the power of Azure’s cutting-edge technology at your disposal through a pay-as-you-go service. To learn more, read [here](https://www.nixtla.io/news/timegen1-on-azure).

```{r, include=FALSE}
options(original_options)
end_vignette()
```
4 changes: 2 additions & 2 deletions vignettes/cross-validation.Rmd
@@ -31,7 +31,7 @@ library(nixtlar)
## 1. Time series cross-validation
Cross-validation is a method for evaluating the performance of a forecasting model. Given a time series, it is carried out by defining a sliding window across the historical data and then predicting the period following it. The accuracy of the model is computed by averaging the accuracy across all the cross-validation windows. This method results in a better estimation of the model’s predictive abilities, since it considers multiple periods instead of just one, while respecting the sequential nature of the data.
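
To make the window layout concrete, here is a minimal base R sketch of where the cutoffs could fall for a series of length `n`. It assumes that consecutive windows are spaced `step_size` observations apart and that the last window ends at the final observation. The argument names mirror the `h`, `n_windows`, and `step_size` parameters of `nixtla_client_cross_validation`, but `cv_cutoffs` is a hypothetical helper and is not meant to reproduce TimeGPT's internal logic.

```r
# Hypothetical helper: index of the last training observation for each window.
# Assumes windows are step_size observations apart and the last window ends
# at observation n.
cv_cutoffs <- function(n, h, n_windows = 1, step_size = h) {
  offsets <- rev(seq_len(n_windows) - 1) * step_size
  cutoffs <- n - h - offsets
  stopifnot(all(cutoffs >= 1))  # every window needs at least one training point
  cutoffs
}

h <- 8
cutoffs <- cv_cutoffs(n = 100, h = h, n_windows = 3, step_size = 8)

# Each window trains on 1..cutoff and is evaluated on cutoff + 1 .. cutoff + h
test_ranges <- lapply(cutoffs, function(k) c(first = k + 1, last = k + h))
```

Averaging an accuracy metric over the evaluation ranges in `test_ranges` gives the cross-validated estimate described above.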

`TimeGPT` has a method for performing time series cross-validation, and users can call it from `nixtlar`. This vignette will explain how to do this. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/anomaly-detection.html) vignette first.
`TimeGPT` has a method for performing time series cross-validation, and users can call it from `nixtlar`. This vignette will explain how to do this. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/get-started.html) vignette first.

## 2. Load data
For this vignette, we'll use the electricity consumption dataset that is included in `nixtlar`, which contains the hourly prices of five different electricity markets.
@@ -44,7 +44,7 @@ head(df)
## 3. Perform time series cross-validation
To perform time series cross-validation using `TimeGPT`, use `nixtlar::nixtla_client_cross_validation`. The key parameters of this method are:

- **df**: The time series data, either as a data frame, a tibble, or a tsibble. It should include at least a column with the datestamps and a column with the observations. Default names for these columns are `ds` and `y`. If different, please specify their names. If working with multiple series, you also need to include a column with unique identifiers. The default name for this column is `unique_id`.
- **df**: The time series data, provided as a data frame, tibble, or tsibble. It must include at least two columns: one for the timestamps and one for the observations. The default names for these columns are `ds` and `y`. If your column names are different, specify them with `time_col` and `target_col`, respectively. If you are working with multiple series, you must also include a column with unique identifiers. The default name for this column is `unique_id`; if different, specify it with `id_col`.
- **h**: The forecast horizon.
- **n_windows**: The number of windows to evaluate. Default value is 1.
- **step_size**: The gap between each cross-validation window. Default value is `NULL`.
18 changes: 9 additions & 9 deletions vignettes/data-requirements.Rmd
@@ -41,34 +41,34 @@ This vignette explains the data requirements for using any of the core functions

## 1. Input Requirements

`nixtlar` now supports the following data structures: data frames, tibbles, and tsibbles. The output format will always be a data frame.
`nixtlar` now supports the following data structures: data frames, tibbles, and tsibbles. The output format will always be a data frame.

Regardless of your data structure, the following two columns must always be included when using any core functions of `nixtlar`:

- **Date Column**: This column must contain timestamps formatted as `YYYY-MM-DD` or `YYYY-MM-DD hh:mm:ss`, either as character strings or date-time objects. The default name for this column is `ds`. If your dataset uses a different name, please specify it by setting the parameter `time_col="your_time_column_name"`.
- **Date Column**: This column must contain timestamps formatted as `YYYY-MM-DD` or `YYYY-MM-DD hh:mm:ss`, either as characters or date-time objects. For date-time objects, we recommend using the `as.POSIX*` functions from base R, although `as.Date` is also supported. The default name for this column is `ds`. If your dataset uses a different name, please specify it by setting the parameter `time_col="your_time_column_name"`.

- **Target Column**: This column should contain the numeric target variable for forecasting. The default name for this column is `y`. If your dataset uses a different name, specify it by setting the parameter `target_col="your_target_column_name"`.
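
As an illustration of these two columns, the sketch below builds a small hourly series in base R. The values are made up, and the UTC time zone is an assumption chosen only to keep the example reproducible.

```r
# Toy hourly timestamps as date-time objects (values are illustrative only)
timestamps <- seq(
  from = as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
  by = "hour",
  length.out = 6
)

# A data frame with the default column names `ds` and `y`
df_valid <- data.frame(
  ds = timestamps,
  y = c(10.2, 11.0, 9.8, 10.5, 12.1, 11.7)
)

# The same timestamps as character strings in the `YYYY-MM-DD hh:mm:ss` format
ds_as_character <- format(timestamps, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Either representation of the date column matches the formats listed above; the target column stays numeric in both cases.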

## 2. Multiple Series

If you are working with multiple series, you must include a column with a unique identifier for each series. This column can contain characters or integers, and its default name is `unique_id`. If your dataset uses a different name for the identifier column, please specify it by setting the parameter `id_col="your_id_column_name"`. If your dataset contains only one series and does not need an identifier, set `id_col` to `NULL`.

Please be aware that in earlier versions of `nixtlar`, the default value of `id_col` was `NULL`, but it is now `unique_id`.

```{r}
# sample valid input
df <- nixtlar::electricity
head(df)
str(df)
```

The `id_col` only accepts characters or integers.

## 3. Exogenous Variables

When using exogenous variables, `nixtlar` differentiates between historical and future exogenous variables:
When using exogenous variables, `nixtlar` distinguishes between historical and future exogenous variables:

- **Historical Exogenous Variables**: These should be included in the input data immediately following the `id_col`, `ds`, and `y` columns. If your dataset contains additional columns that are not exogenous variables, you must remove them before using any core functions of `nixtlar`.

- **Future Exogenous Variables**: These correspond to the `X_df` parameter and should cover the entire forecast horizon. This dataset should include columns with the appropriate timestamps and, if available, unique identifiers, formatted as explained in previous sections.
- **Future Exogenous Variables**: These correspond to the `X_df` parameter and should cover the entire forecast horizon. This dataset must include columns with the appropriate timestamps and, if applicable, unique identifiers, formatted as described in the previous sections.
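
One way to sanity-check the future exogenous data before forecasting is sketched below. It interprets "cover the entire forecast horizon" as "exactly `h` rows per series", which is an assumption, and it does not verify the timestamps themselves. `covers_horizon` is a hypothetical helper that is not part of `nixtlar`, and the toy values are made up.

```r
# Hypothetical check (not part of nixtlar): does every series in X_df
# have exactly h future rows?
covers_horizon <- function(X_df, h, id_col = "unique_id") {
  rows_per_series <- table(X_df[[id_col]])
  all(rows_per_series == h)
}

h <- 3
X_df_toy <- data.frame(
  unique_id = rep(c("A", "B"), each = h),
  ds = rep(c("2024-01-04", "2024-01-05", "2024-01-06"), times = 2),
  temperature = c(20.1, 19.8, 21.0, 15.2, 14.9, 16.3)
)

ok <- covers_horizon(X_df_toy, h = h)                # both series have 3 rows
too_short <- covers_horizon(X_df_toy[-1, ], h = h)   # series "A" now has 2 rows
```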

```{r}
# sample valid input with exogenous variables
@@ -83,11 +83,11 @@ To learn more about how to use exogenous variables, please refer to the [Exogeno

## 4. Missing values

When using `TimeGPT` via `nixtlar`, you need to ensure that:
When using `TimeGPT` via `nixtlar`, ensure the following:

1. **No Missing Values in Target Column**: The target column must not contain any missing values (NA).
1. **No Missing Values in the Target Column**: The target column must not contain any missing values (`NA`).

2. **Continuous Date Sequence**: The dates must be continuous and without any gaps, from the start date to the end date, matching the frequency of the data.
2. **Continuous Date Sequence**: The dates must be continuous, without any gaps, from the start date to the end date, matching the frequency of the data.
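
Both conditions can be checked before sending data to `TimeGPT`. The sketch below handles a single series with a regular date frequency; `check_series` is a hypothetical helper that is not part of `nixtlar`, and the toy data is made up for illustration.

```r
# Hypothetical helper (not part of nixtlar): count NA targets and list the
# dates missing from an otherwise regular sequence. Assumes a single series
# whose timestamps can be parsed by as.Date().
check_series <- function(df, by = "day", time_col = "ds", target_col = "y") {
  dates <- as.Date(df[[time_col]])
  expected <- seq(min(dates), max(dates), by = by)
  list(
    n_missing_target = sum(is.na(df[[target_col]])),
    missing_dates = expected[!expected %in% dates]
  )
}

# Toy daily series with one NA target and one skipped date
toy <- data.frame(
  ds = c("2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"),
  y = c(1.0, NA, 3.0, 4.0)
)

report <- check_series(toy)
```

A series is ready to use when `n_missing_target` is zero and `missing_dates` is empty. For multiple series, the same check would need to be applied to each `unique_id` separately.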

Currently, **nixtlar** does not provide any functionality to fill missing values or dates. To learn more about this, please refer to the vignette on [Special Topics](https://nixtla.github.io/nixtlar/articles/special-topics.html).

10 changes: 5 additions & 5 deletions vignettes/exogenous-variables.Rmd
@@ -32,7 +32,7 @@ library(nixtlar)

Exogenous variables are external factors that provide additional information about the behavior of the target variable in time series forecasting. These variables, which are correlated with the target, can significantly improve predictions. Examples of exogenous variables include weather data, economic indicators, holiday markers, and promotional sales.

`TimeGPT` allows you to include exogenous variables when generating a forecast. This vignette will show you how to include them. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/anomaly-detection.html) vignette first.
`TimeGPT` allows you to include exogenous variables when generating a forecast. This vignette will show you how to include them. It assumes you have already set up your API key. If you haven't done this, please read the [Get Started](https://nixtla.github.io/nixtlar/articles/get-started.html) vignette first.

## 2. Load data

@@ -43,11 +43,11 @@ df_exo_vars <- nixtlar::electricity_exo_vars
head(df_exo_vars)
````

When using exogenous variables, `nixtlar` differentiates between historical and future exogenous variables:
When using exogenous variables, `nixtlar` distinguishes between historical and future exogenous variables:

- **Historical Exogenous Variables**: These should be included in the input data immediately following the `id_col`, `ds`, and `y` columns. If your dataset contains additional columns that are not exogenous variables, you must remove them before using any core functions of `nixtlar`.

- **Future Exogenous Variables**: These correspond to the `X_df` parameter and should cover the entire forecast horizon. This dataset should include columns with the appropriate timestamps and, if available, unique identifiers, formatted as explained in previous sections.
- **Future Exogenous Variables**: These correspond to the `X_df` parameter and should cover the entire forecast horizon. This dataset must include columns with the appropriate timestamps and, if applicable, unique identifiers.

````{r}
future_exo_vars <- nixtlar::electricity_future_exo_vars
@@ -63,10 +63,10 @@ fcst_exo_vars <- nixtla_client_forecast(df_exo_vars, h = 24, X_df = future_exo_v
head(fcst_exo_vars)
````

For comparison, we will also generate a forecast without the exogenous variables.
For comparison, we will also generate a forecast without exogenous variables.

````{r}
df <- nixtlar::electricity # same dataset but without the exogenous variables
df <- nixtlar::electricity # same dataset but without exogenous variables
fcst <- nixtla_client_forecast(df, h = 24)
head(fcst)
31 changes: 25 additions & 6 deletions vignettes/get-started.Rmd
@@ -35,17 +35,26 @@ First, you need to set up your API key. An API key is a string of characters tha

When using `nixtlar`, there are two ways of setting up your API key:

### a. Using the `nixtla_set_api_key` function
### a. Using the `nixtla_client_setup` function
`nixtlar` has a function to easily set up your API key for your current R session. Simply call

```{r eval=FALSE}
nixtla_set_api_key(api_key = "paste your API key here")
nixtla_client_setup(api_key = "Your API key here")
```

Keep in mind that if you close your R session or you re-start it, then you'll need to set up your API key again.

When using Azure, you also need to add the `base_url` parameter to the `nixtla_client_setup` function.

```{r eval=FALSE}
nixtla_client_setup(
base_url = "Base URL",
api_key = "Your API key here"
)
```

### b. Using an environment variable
For a more persistent method that can be used across different projects, set up your API key as environment variable. To do this, you first need to load the `usethis` package.
For a more persistent method that can be used across different projects, set up your API key as an environment variable. To do this, first load the `usethis` package.

```{r eval=FALSE, message=FALSE}
library(usethis)
@@ -56,10 +65,20 @@ This will open your `.Reviron` file. Place your API key here and named it `NIXTL

```{r eval=FALSE}
# Inside the .Renviron file
NIXTLA_API_KEY="paste your API key here"
NIXTLA_API_KEY="Your API key here"
```

You'll need to restart R for changes to take effect. Keep in mind that modifying the `.Renviron` file affects all of your R sessions, so if you're not comfortable with this, use the `nixtla_client_setup` function instead.

If you are using Azure, you also need to specify the `NIXTLA_BASE_URL`.

```{r eval=FALSE}
# Inside the .Renviron file
NIXTLA_BASE_URL="Base URL"
NIXTLA_API_KEY="Your API key here"
```
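
After restarting R, you may want to confirm that both variables are visible to the session. The sketch below uses base R only; `read_nixtla_env` is a hypothetical helper that is not part of `nixtlar`, and the `Sys.setenv` call with dummy values is there purely so the example runs on its own.

```r
# Hypothetical helper (not part of nixtlar): read the two variables and fail
# early with a clear message if either one is unset.
read_nixtla_env <- function() {
  vars <- c(base_url = "NIXTLA_BASE_URL", api_key = "NIXTLA_API_KEY")
  values <- Sys.getenv(vars, unset = "")
  names(values) <- names(vars)
  unset <- vars[!nzchar(values)]
  if (length(unset) > 0) {
    stop("Unset environment variable(s): ", paste(unset, collapse = ", "))
  }
  as.list(values)
}

# Illustration only: set dummy values for the current session, then read them back
Sys.setenv(
  NIXTLA_BASE_URL = "https://example.invalid/",
  NIXTLA_API_KEY = "dummy_api_key"
)
creds <- read_nixtla_env()
```

The resulting `creds$base_url` and `creds$api_key` could also be passed to `nixtla_client_setup` if you prefer to configure the session explicitly.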

You'll need to restart R for changes to take effect. Keep in mind that modifying the `.Renviron` file affects all of your R sessions, so if you're not comfortable with this, set your API key using the `nixtla_set_api_key` function.
For details on how to set up your API key, check out the [Setting Up Your API Key](https://nixtla.github.io/nixtlar/articles/setting-up-your-api-key.html) vignette. To learn more about how to use Azure, please refer to the [TimeGEN-1 Quickstart (Azure)](https://nixtla.github.io/nixtlar/articles/azure-quickstart.html) vignette.

### Validate your API key
If you want to validate your API key, call `nixtla_validate_api_key`.
@@ -89,7 +108,7 @@ head(nixtla_client_fcst)
`nixtlar` includes a function to plot the historical data and any output from `nixtla_client_forecast`, `nixtla_client_historic`, `nixtla_client_detect_anomalies`, and `nixtla_client_cross_validation`. If you have long series, you can use `max_insample_length` to plot only the last N historical values (the forecast will always be plotted in full).

```{r}
nixtla_client_plot(df, nixtla_client_fcst, id_col = "unique_id", max_insample_length = 200)
nixtla_client_plot(df, nixtla_client_fcst, max_insample_length = 200)
```

```{r, include=FALSE}