carte_ai.data package
carte_ai.data.load_data module
- carte_ai.data.load_data.spotify()
Load and explore the Spotify dataset, which contains detailed information about over 600,000 Spotify tracks, including audio features, popularity metrics, and genres.
This dataset can be used for:
- Building a recommendation system based on user input or preferences.
- Classification tasks using audio features and genres.
- Any other applications involving music analysis and prediction.
Variables:
- track_id: Unique identifier for the track.
- artists: Names of the artists who performed the track (separated by ";").
- album_name: Name of the album.
- track_name: Name of the track.
- popularity: Popularity score (0–100).
- duration_ms: Length of the track in milliseconds.
- explicit: Whether the track contains explicit lyrics (true/false).
- danceability: Danceability score (0.0–1.0).
- energy: Energy score (0.0–1.0).
- key: Musical key of the track.
- loudness: Loudness in decibels (dB).
- mode: Modality of the track (major=1, minor=0).
- speechiness: Presence of spoken words (0.0–1.0).
- acousticness: Confidence measure for acoustic content (0.0–1.0).
- instrumentalness: Likelihood of being instrumental (0.0–1.0).
- liveness: Likelihood that the track was recorded live, with an audience present (0.0–1.0).
- valence: Musical positiveness (0.0–1.0).
- tempo: Tempo in beats per minute (BPM).
- time_signature: Time signature (3–7).
- track_genre: Genre of the track.
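A few of the variables above need light preprocessing before modeling. The sketch below shows one way to split the ";"-separated `artists` field and convert `duration_ms` to minutes. The helper names `split_artists` and `duration_minutes` and the sample row are hypothetical and are not part of `carte_ai`.

```python
def split_artists(artists: str) -> list[str]:
    """Split a ";"-separated artists string into a list of names."""
    return [name.strip() for name in artists.split(";") if name.strip()]


def duration_minutes(duration_ms: int) -> float:
    """Convert a track length from milliseconds to minutes."""
    return duration_ms / 60_000


# Hypothetical row, invented for illustration (not taken from the dataset).
row = {"artists": "Artist A;Artist B", "duration_ms": 210_000, "mode": 1}

print(split_artists(row["artists"]))         # ['Artist A', 'Artist B']
print(duration_minutes(row["duration_ms"]))  # 3.5
print("major" if row["mode"] == 1 else "minor")  # major
```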
Example Usage:
from carte_ai.data.load_data import spotify
num_train = 128 # Example: set the number of training groups/entities
random_state = 1 # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = spotify(num_train, random_state)
# Print dataset shapes
print("Spotify dataset:", X_train.shape, X_test.shape)
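The example passes `num_train` and `random_state` and unpacks four return values. One plausible reading, which is an assumption and not confirmed by this page, is that the loader performs a seeded shuffle and keeps `num_train` rows for training. The sketch below illustrates that idea on toy data; `seeded_split` is a hypothetical stand-in and not the `carte_ai` implementation.

```python
import random


def seeded_split(rows, targets, num_train, random_state):
    """Hypothetical stand-in: shuffle indices with a fixed seed, then take
    the first num_train rows for training and the rest for testing."""
    indices = list(range(len(rows)))
    random.Random(random_state).shuffle(indices)
    train_idx, test_idx = indices[:num_train], indices[num_train:]
    X_train = [rows[i] for i in train_idx]
    X_test = [rows[i] for i in test_idx]
    y_train = [targets[i] for i in train_idx]
    y_test = [targets[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy data, invented for illustration.
rows = [{"track_id": f"t{i}"} for i in range(10)]
targets = [i * 10 for i in range(10)]

X_train, X_test, y_train, y_test = seeded_split(
    rows, targets, num_train=4, random_state=1
)
print(len(X_train), len(X_test))  # 4 6
```

Calling `seeded_split` again with the same `random_state` reproduces the same split, which is the usual purpose of fixing a seed.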
- carte_ai.data.load_data.wina_pl()
Load and explore the Wina_PL dataset, which contains detailed information about wine prices and attributes in the Polish market.
This dataset is ideal for analysis and machine learning tasks related to wine classification, pricing, and preferences.
Variables:
- name: Name of the wine.
- country: Country of origin.
- region: Region where the wine is produced.
- appellation: Controlled origin label.
- vineyard: Vineyard producing the wine.
- vintage: Year of production.
- volume: Bottle volume in milliliters.
- ABV: Alcohol by volume (percentage).
- serving_temperature: Recommended serving temperature.
- wine_type: Type of wine (e.g., red, white).
- taste: Wine taste profile (e.g., dry, sweet).
- style: Style of the wine (e.g., full-bodied).
- vegan: Whether the wine is vegan-friendly.
- natural: Indicates if the wine is natural.
- grapes: Main grape varieties used.
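Because `volume` is given in milliliters and `ABV` as a percentage, simple derived quantities follow directly from the stated units. The sketch below computes bottle volume in liters and the amount of pure alcohol per bottle. It assumes both columns are already numeric, which may not hold for the raw data, and the helper names and sample row are hypothetical.

```python
def volume_liters(volume_ml: float) -> float:
    """Convert a bottle volume from milliliters to liters."""
    return volume_ml / 1000


def pure_alcohol_ml(volume_ml: float, abv_percent: float) -> float:
    """Milliliters of pure alcohol in a bottle: volume * ABV / 100."""
    return volume_ml * abv_percent / 100


# Hypothetical row, invented for illustration (not taken from the dataset).
row = {"volume": 750, "ABV": 13.5}

print(volume_liters(row["volume"]))                # 0.75
print(pure_alcohol_ml(row["volume"], row["ABV"]))  # 101.25
```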
Example Usage:
from carte_ai.data.load_data import wina_pl
num_train = 128 # Example: set the number of training groups/entities
random_state = 1 # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wina_pl(num_train, random_state)
# Print dataset shapes
print("Wina Poland dataset:", X_train.shape, X_test.shape)
For more details, visit the Kaggle dataset page.
- carte_ai.data.load_data.wine_vivino_price()
Load and explore the Wine Vivino Price dataset, which contains detailed information about wine ratings, prices, and attributes from the Vivino platform.
This dataset is ideal for analysis and machine learning tasks related to wine recommendation, pricing, and consumer preferences.
Variables:
- Winery: Name of the winery.
- Year: Vintage year of the wine.
- Wine ID: Unique identifier for the wine.
- Wine: Name of the wine.
- Rating: Average rating of the wine.
- num_review: Number of reviews for the wine.
- price: Price of the wine.
- Country: Country where the wine is produced.
- Region: Region of the winery.
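The variable names above mix capitalization and spacing ("Wine ID", "num_review", "price"). If a uniform naming scheme is preferred downstream, a small helper can normalize them to snake_case. This is an optional convenience sketch; `to_snake_case` is a hypothetical name and not part of `carte_ai`.

```python
import re


def to_snake_case(name: str) -> str:
    """Lowercase a column name and replace runs of non-alphanumerics with "_"."""
    return re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()


columns = [
    "Winery", "Year", "Wine ID", "Wine", "Rating",
    "num_review", "price", "Country", "Region",
]
print([to_snake_case(c) for c in columns])
# ['winery', 'year', 'wine_id', 'wine', 'rating',
#  'num_review', 'price', 'country', 'region']
```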
Example Usage:
from carte_ai.data.load_data import wine_vivino_price
num_train = 128 # Example: set the number of training groups/entities
random_state = 1 # Set a random seed for reproducibility
X_train, X_test, y_train, y_test = wine_vivino_price(num_train, random_state)
# Print dataset shapes
print("Wine Vivino Price dataset:", X_train.shape, X_test.shape)
For more details, visit the dataset page.
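The three loaders documented here are called the same way in their examples, with `(num_train, random_state)` and four return values. Under that assumption, a small helper can summarize the split sizes of several datasets in one pass. `summarize_splits` and `fake_loader` are hypothetical names used only for this sketch; the stub stands in for a real loader so the block runs without `carte_ai` installed.

```python
def summarize_splits(loaders, num_train, random_state):
    """Call each loader and record (number of train rows, number of test rows)."""
    summary = {}
    for name, loader in loaders.items():
        X_train, X_test, y_train, y_test = loader(num_train, random_state)
        summary[name] = (len(X_train), len(X_test))
    return summary


def fake_loader(num_train, random_state):
    """Hypothetical stub standing in for a real loader; returns toy lists."""
    rows = list(range(200))
    return rows[:num_train], rows[num_train:], rows[:num_train], rows[num_train:]


print(summarize_splits({"fake": fake_loader}, num_train=128, random_state=1))
# {'fake': (128, 72)}
```

With `carte_ai` installed, the same helper could presumably be called with `{"spotify": spotify, "wina_pl": wina_pl, "wine_vivino_price": wine_vivino_price}` after importing those functions from `carte_ai.data.load_data`.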