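Orbit is an open-source Python package developed by Uber for Bayesian time series forecasting. It targets practical forecasting scenarios and offers trend analysis, seasonality modeling, and uncertainty quantification. Its model families include exponential smoothing (ETS), Local Global Trend (LGT), Damped Local Trend (DLT), and Kernel-based Time-varying Regression (KTR), which gives it some flexibility across time series problems.
Orbit takes a Bayesian approach to forecasting, with built-in support for uncertainty quantification and for decomposing a series into trend and seasonality. It models a single response series, optionally with regressors, and its API is simple to use.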
Like all Bayesian methods, it is computationally expensive, so producing a forecast can take longer than with ARIMA.
You can install Orbit with pip:
pip install orbit-ml
I use the sunspot dataset, an idea borrowed from Jason Brownlee. The data now comes from WDC-SILSO, Royal Observatory of Belgium, Brussels.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from orbit.models import DLT
from orbit.diagnostics.metrics import smape
from prophet import Prophet

# Root mean squared error between two aligned series
def rmse(actual, predicted):
    return np.sqrt(np.mean((actual - predicted) ** 2))
# Load the monthly sunspot data (the CSV is expected to have "Month" and "Sunspot" columns)
data = pd.read_csv("SN_m_tot_V2.0.csv")
data["Month"] = pd.to_datetime(data["Month"])
# Create separate dataframes for Orbit and Prophet with correct column names
orbit_data = data.copy()
prophet_data = data.copy()
# Rename columns for Orbit
orbit_data = orbit_data.rename(columns={
    "Month": "date",
    "Sunspot": "response",
})
# Rename columns for Prophet
prophet_data = prophet_data.rename(columns={
    "Month": "ds",
    "Sunspot": "y",
})
# Split the data
train_size = len(data) - 48
# Orbit train/test split
orbit_train = orbit_data.iloc[:train_size]
orbit_test = orbit_data.iloc[train_size:]
# Prophet train/test split
prophet_train = prophet_data.iloc[:train_size]
prophet_test = prophet_data.iloc[train_size:]
# --- Orbit Model ---
model_orbit = DLT(
    response_col="response",
    date_col="date",
    seasonality=12,
)
# Fit Orbit model
model_orbit.fit(df=orbit_train)
predictions_orbit = model_orbit.predict(df=orbit_data)
# --- Prophet Model ---
# Initialize and fit Prophet model
model_prophet = Prophet(yearly_seasonality=True)
model_prophet.fit(prophet_train)
# Make predictions with Prophet
future = prophet_data[['ds']]
predictions_prophet = model_prophet.predict(future)
# Create subplot for both models
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 12))
# Plot Orbit results
ax1.plot(orbit_data['date'], orbit_data['response'], label='Actual', alpha=0.5)
ax1.plot(predictions_orbit['date'], predictions_orbit['prediction'], label='Predicted', color='red')
ax1.fill_between(predictions_orbit['date'],
                 predictions_orbit['prediction_5'],
                 predictions_orbit['prediction_95'],
                 color='red',
                 alpha=0.1)
ax1.axvline(x=orbit_train['date'].iloc[-1], color='black', linestyle='--', label='Train/Test Split')
ax1.legend()
ax1.set_title('Orbit Model Forecast')
# Plot Prophet results
ax2.plot(prophet_data['ds'], prophet_data['y'], label='Actual', alpha=0.5)
ax2.plot(predictions_prophet['ds'], predictions_prophet['yhat'], label='Predicted', color='green')
ax2.fill_between(predictions_prophet['ds'],
                 predictions_prophet['yhat_lower'],
                 predictions_prophet['yhat_upper'],
                 color='green',
                 alpha=0.1)
ax2.axvline(x=prophet_train['ds'].iloc[-1], color='black', linestyle='--', label='Train/Test Split')
ax2.legend()
ax2.set_title('Prophet Model Forecast')
plt.tight_layout()
plt.savefig("timeseries_comparison.png")
plt.show()
# Calculate metrics for test set - Orbit
test_predictions_orbit = predictions_orbit.iloc[train_size:]
test_actual = orbit_test['response']
print("\nOrbit Test Set Metrics:")
print("SMAPE:", smape(test_actual, test_predictions_orbit['prediction']))
print("RMSE:", rmse(test_actual, test_predictions_orbit['prediction']))
# Calculate metrics for test set - Prophet
test_predictions_prophet = predictions_prophet.iloc[train_size:]
print("\nProphet Test Set Metrics:")
print("SMAPE:", smape(test_actual, test_predictions_prophet['yhat']))
print("RMSE:", rmse(test_actual, test_predictions_prophet['yhat']))
# Show prediction intervals for both models
print("\nOrbit Prediction Intervals:")
print(predictions_orbit[["prediction", "prediction_5", "prediction_95"]].head())
print("\nProphet Prediction Intervals:")
print(predictions_prophet[["yhat", "yhat_lower", "yhat_upper"]].head())
I built a Damped Local Trend (DLT) model. It extends the Local Linear Trend (LLT) model by damping the long-term trend.
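To show what damping does to a forecast, here is a minimal sketch of the classical damped-trend projection from exponential smoothing. It is not Orbit's exact DLT parameterization; the function name `damped_trend_forecast` and the numbers are made up for illustration.

```python
def damped_trend_forecast(level, trend, phi, horizon):
    """Project `horizon` steps ahead with a trend damped by phi (0 < phi <= 1).

    Illustrative only: the h-step forecast is
    level + (phi + phi**2 + ... + phi**h) * trend.
    """
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # running sum phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# phi = 1.0 reproduces an undamped linear trend; phi < 1 flattens it out.
linear = damped_trend_forecast(level=100.0, trend=2.0, phi=1.0, horizon=5)
damped = damped_trend_forecast(level=100.0, trend=2.0, phi=0.8, horizon=5)
print("linear:", linear)
print("damped:", damped)
```

With phi below 1, each step adds less than the one before, so the forecast levels off instead of growing without bound. That is the behavior the damping is meant to produce over long horizons.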
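Because the data is monthly, seasonality is set to 12.
Orbit provides a credible interval for every prediction, so you can judge how uncertain each forecast is.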
# Extract the point forecast and the 5th/95th percentile bounds
predictions_orbit[["prediction", "prediction_5", "prediction_95"]].head()
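Columns like these are summaries of posterior draws. The sketch below shows how such bounds can be derived for a single date, assuming the point forecast is the median of the draws. The helper names and the simulated draws are hypothetical and do not come from Orbit's internals.

```python
import random


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize_draws(draws):
    """Reduce the draws for one date to a median and a 5%-95% interval."""
    ordered = sorted(draws)
    return {
        "prediction_5": percentile(ordered, 5),
        "prediction": percentile(ordered, 50),
        "prediction_95": percentile(ordered, 95),
    }


# Hypothetical posterior draws for one forecast date (simulated, not real output)
rng = random.Random(0)
draws = [rng.gauss(50.0, 10.0) for _ in range(2000)]
summary = summarize_draws(draws)
print(summary)
```

The wider the spread of the draws, the further apart the two bounds land, which is what shows up as a wide shaded band in the plot.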
Evaluation metrics
I ran Orbit and Prophet on the same data.
Orbit Test Set Metrics:
SMAPE: 1.2577330909302515
RMSE: 99.90859640412387
Prophet Test Set Metrics:
SMAPE: 0.6695273556170346
RMSE: 64.91185978677316
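To make these numbers easier to read, here is a minimal pure-Python sketch of the two metrics. It assumes the common SMAPE definition on a 0 to 2 scale, which fits the values above since one of them exceeds 1. The function names and sample values are made up for illustration, and Orbit's own implementation may differ in details.

```python
import math


def smape_sketch(actual, predicted):
    """Symmetric MAPE on a 0-2 scale; pairs where both values are 0 are skipped."""
    terms = [
        abs(a - p) / (abs(a) + abs(p))
        for a, p in zip(actual, predicted)
        if abs(a) + abs(p) > 0
    ]
    if not terms:
        return 0.0
    return 2 * sum(terms) / len(terms)


def rmse_sketch(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values, only to exercise the functions
actual = [100.0, 50.0, 10.0]
predicted = [110.0, 40.0, 30.0]
print("SMAPE:", smape_sketch(actual, predicted))
print("RMSE:", rmse_sketch(actual, predicted))
```

On this scale, 0 is a perfect forecast and 2 is the worst case, reached for example when the forecast is 0 and the actual value is not. An SMAPE above 1 therefore points to a very poor fit.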
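Clearly, neither model forecasts well. Prophet does somewhat better, and the Orbit model's uncertainty range is very wide. One likely reason is that a 12-month seasonality cannot represent the roughly 11-year solar cycle that dominates sunspot counts.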
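One way to put a number on "very wide" is to compute the average interval width and the share of held-out points that fall inside the interval. The sketch below uses made-up values and a hypothetical helper, `interval_diagnostics`. Note also that the Orbit columns used here span the 5th to 95th percentiles (a 90% interval), while Prophet's default `interval_width` is 0.8, so the two bands are not directly comparable unless one of them is changed.

```python
def interval_diagnostics(actual, lower, upper):
    """Return the mean interval width and the empirical coverage of the interval."""
    if not actual or not (len(actual) == len(lower) == len(upper)):
        raise ValueError("inputs must be non-empty and of equal length")
    widths = [u - l for l, u in zip(lower, upper)]
    hits = sum(1 for a, l, u in zip(actual, lower, upper) if l <= a <= u)
    return {
        "mean_width": sum(widths) / len(widths),
        "coverage": hits / len(actual),
    }


# Made-up test-set values and interval bounds, only to exercise the function
actual = [10.0, 20.0, 30.0, 40.0]
lower = [5.0, 18.0, 35.0, 0.0]
upper = [15.0, 25.0, 45.0, 100.0]
diagnostics = interval_diagnostics(actual, lower, upper)
print(diagnostics)
```

A well-calibrated 90% interval should contain roughly 90% of the held-out points. Coverage near 100% with a large mean width suggests the interval is too conservative to be useful.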
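Orbit is a Bayesian forecasting library that is most useful when understanding forecast uncertainty matters, as in demand forecasting, financial planning, and policy impact analysis. It needs more compute than simpler forecasting methods, but its Bayesian framework and feature set make it a good choice when quantifying uncertainty is the main goal.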