Hi,šŸ‘‹ we have updated the app and fixed multiple bugs. We are lacking funds, request to free user not to use Adblock. Ads are non intrusive. 😊

Carousel Studio

Repurpose X Threads into LinkedIn & Instagram Carousels

Thread Truncated (Cap Enforced)

Only the first 20 tweets are unrolled into slides to ensure reliable PDF exporting and high server performance.

Canvas & Ratio

Choose your destination platform format


Layout Template

Choose a content structure for your slides


Preset Themes


Typography & Sizing

Title Font Size36px
Body Font Size18px
Header & Footer Size12px

Brand Kit Customization

AGENCY

Configure brand assets for headers & footers

MULTI-PROFILES (AGENCY)
AGENCY
SAVE PRESETS (AGENCY)

Outro Slide CTA

Customize your closing call-to-action slide

#1
#2
#3

Background Pattern

Source Content

Build Your Carousel

Drag and drop any post card below onto a slide, or use the quick buttons to insert content/images instantly!

Drag Post #1
zostaff
@zostaff

## <b>Complete Technical Guide: From Data Collection to a Working Model</b>

Apply Image
Drag Post #2
zostaff
@zostaff

This article provides a step-by-step walkthrough of the architecture and implementation of an analytical system for predicting football match outcomes. The system uses Claude API from Anthropic as its "brain" - for data interpretation, feature engineering, and generating final predictions. The key innovation is <b>combining three probability layers</b>: bookmaker odds (Bet365), <b>Polymarket</b> prediction market data (blockchain-based crowd intelligence), and a custom ML model. The entire pipeline is written in Python using pandas, scikit-learn, XGBoost, and matplotlib.

Drag Post #3
zostaff
@zostaff

## <b>System Architecture</b>

Drag Post #4
zostaff
@zostaff

<b><i>The system consists of several layers, each serving a specific role:</i></b><i></i>

Drag Post #5
zostaff
@zostaff

<pre><code lang="">ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ DATA LAYER │ │ football-data.co.uk │ API-Football │ FBref │ Kaggle │ │ │ │ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ │ │ šŸ”— Polymarket Gamma API (prediction market) │ │ │ │ Crowd-sourced probabilities on the Polygon chain │ │ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ PROCESSING LAYER │ │ pandas │ numpy │ data cleaning │ feature engineering │ │ │ │ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ │ │ Claude API: feature generation, │ │ │ │ context analysis, statistics interpretation │ │ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ │ │ │ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ │ │ Merging 3 probability layers: │ │ │ │ Bookmaker odds + Polymarket prices + ML model │ │ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ MODEL LAYER │ │ Logistic Regression │ Random Forest │ XGBoost │ │ Ensemble (Voting / Stacking) │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ INTERPRETATION LAYER │ │ Claude API: natural language prediction explanation │ │ + confidence assessment + divergence analysis │ │ between bookmaker / Polymarket / ML │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”¬ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜ │ ā–¼ ā”Œā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā” │ OUTPUT LAYER │ │ matplotlib visualizations │ JSON reports │ Telegram bot │ ā””ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”€ā”˜</code></pre>

Drag Post #6
zostaff
@zostaff

<b>Required Dependencies</b>

Drag Post #7
zostaff
@zostaff

<pre><code lang="python"># requirements.txt anthropic&gt;=0.40.0 pandas&gt;=2.1.0 numpy&gt;=1.24.0 scikit-learn&gt;=1.3.0 xgboost&gt;=2.0.0 matplotlib&gt;=3.8.0 seaborn&gt;=0.13.0 requests&gt;=2.31.0 python-dotenv&gt;=1.0.0 schedule&gt;=1.2.0 # pipeline automation</code></pre>

Drag Post #8
zostaff
@zostaff

<b>Installation:</b>

Drag Post #9
zostaff
@zostaff

<pre><code lang="bash">pip install anthropic pandas numpy scikit-learn xgboost matplotlib seaborn requests python-dotenv schedule</code></pre>

Drag Post #10
zostaff
@zostaff

<i>The Polymarket Gamma API does not require a dedicated SDK - all requests are made via `requests` to public REST endpoints without authentication.</i>

Drag Post #11
zostaff
@zostaff

## <b>Data Collection and Preparation</b>

Drag Post #12
zostaff
@zostaff

<i>The primary data source is <b>football-data.co.uk</b>, which provides CSV files with match results and statistics for all major European leagues. The data includes goals, shots, corners, fouls, cards, and bookmaker odds.</i>

Drag Post #13
zostaff
@zostaff

<b><i>Data Loading</i></b><i></i>

Drag Post #14
zostaff
@zostaff

<pre><code lang="python">import pandas as pd import numpy as np from pathlib import Path class FootballDataLoader: """ Historical football match data loader. Source: football-data.co.uk """ BASE_URL = "https://www.football-data.co.uk/mmz4281" LEAGUES = { "E0": "Premier League", "SP1": "La Liga", "D1": "Bundesliga", "I1": "Serie A", "F1": "Ligue 1", } COLUMNS_TO_KEEP = [ "Date", "HomeTeam", "AwayTeam", "FTHG", "FTAG", "FTR", # Final score and result "HTHG", "HTAG", "HTR", # Half-time score "HS", "AS", # Shots "HST", "AST", # Shots on target "HF", "AF", # Fouls "HC", "AC", # Corners "HY", "AY", # Yellow cards "HR", "AR", # Red cards "B365H", "B365D", "B365A", # Bet365 odds ] def __init__(self, seasons: list[str], leagues: list[str] = None): self.seasons = seasons # format: ["2324", "2223", "2122"] self.leagues = leagues or list(self.LEAGUES.keys()) def load_season(self, league: str, season: str) -&gt; pd.DataFrame: """Load data for a single season and league.""" url = f"{self.BASE_URL}/{season}/{league}.csv" try: df = pd.read_csv(url, encoding="utf-8", on_bad_lines="skip") available_cols = [c for c in self.COLUMNS_TO_KEEP if c in df.columns] df = df[available_cols].dropna(subset=["HomeTeam", "AwayTeam", "FTR"]) df["League"] = self.LEAGUES.get(league, league) df["Season"] = season return df except Exception as e: print(f"Error loading {league}/{season}: {e}") return pd.DataFrame() def load_all(self) -&gt; pd.DataFrame: """Load all data for specified leagues and seasons.""" frames = [] for league in self.leagues: for season in self.seasons: df = self.load_season(league, season) if not df.empty: frames.append(df) print(f" āœ“ {self.LEAGUES.get(league)}, season {season}: " f"{len(df)} matches") result = pd.concat(frames, ignore_index=True) print(f"\nTotal loaded: {len(result)} matches") return result # === Usage === loader = FootballDataLoader( seasons=["2425", "2324", "2223", "2122", "2021"], leagues=["E0", "SP1", "D1"] # EPL, La Liga, Bundesliga ) raw_data = loader.load_all()</code></pre>

Drag Post #15
zostaff
@zostaff

<b><i>Cleaning and Transformation</i></b><i></i>

Drag Post #16
zostaff
@zostaff

<pre><code lang="python">class DataCleaner: """Data cleaning and standardization.""" @staticmethod def clean(df: pd.DataFrame) -&gt; pd.DataFrame: df = df.copy() # Standardize date format df["Date"] = pd.to_datetime(df["Date"], dayfirst=True, errors="coerce") df = df.dropna(subset=["Date"]) df = df.sort_values("Date").reset_index(drop=True) # Numeric columns numeric_cols = [ "FTHG", "FTAG", "HTHG", "HTAG", "HS", "AS", "HST", "AST", "HF", "AF", "HC", "AC", "HY", "AY", "HR", "AR", "B365H", "B365D", "B365A", ] for col in numeric_cols: if col in df.columns: df[col] = pd.to_numeric(df[col], errors="coerce") # Encoding result: H=2, D=1, A=0 result_map = {"H": 2, "D": 1, "A": 0} df["Result"] = df["FTR"].map(result_map) df = df.dropna(subset=["Result"]) df["Result"] = df["Result"].astype(int) return df clean_data = DataCleaner.clean(raw_data) print(f"After cleaning: {len(clean_data)} matches") print(f"Result distribution:\n{clean_data['FTR'].value_counts()}")</code></pre>

Drag Post #17
zostaff
@zostaff

## <b>Feature Engineering with Claude</b>

Drag Post #18
zostaff
@zostaff

This is the key stage where we create features that enable the model to "understand" the match context. Here, Claude serves as an intelligent assistant - helping generate feature ideas and evaluate contextual factors.

Drag Post #19
zostaff
@zostaff

<b><i>Statistical Features (Rolling Averages)</i></b><i></i>

Drag Post #20
zostaff
@zostaff

<pre><code lang="python">class FeatureEngineer: """ Feature generation based on historical team statistics. Key idea: for each match we use ONLY data available BEFORE the match starts. """ def __init__(self, window: int = 5): self.window = window def compute_team_stats(self, df: pd.DataFrame) -&gt; pd.DataFrame: """ Compute rolling averages for each team over the last N matches. """ df = df.sort_values("Date").copy() # Create separate records for home and away teams home_records = df[["Date", "HomeTeam", "FTHG", "FTAG", "HS", "AS", "HST", "AST", "HC", "AC", "HF", "AF"]].copy() home_records.columns = ["Date", "Team", "GF", "GA", "Shots", "ShotsAgainst", "SoT", "SoTAgainst", "Corners", "CornersAgainst", "Fouls", "FoulsAgainst"] home_records["IsHome"] = 1 away_records = df[["Date", "AwayTeam", "FTAG", "FTHG", "AS", "HS", "AST", "HST", "AC", "HC", "AF", "HF"]].copy() away_records.columns = home_records.columns away_records["IsHome"] = 0 all_records = pd.concat([home_records, away_records]) all_records = all_records.sort_values("Date") # Calculate rolling averages per team stats_cols = ["GF", "GA", "Shots", "ShotsAgainst", "SoT", "SoTAgainst", "Corners", "CornersAgainst", "Fouls", "FoulsAgainst"] rolling_stats = {} for team in all_records["Team"].unique(): team_data = all_records[all_records["Team"] == team].copy() for col in stats_cols: # shift(1) — to exclude the current match team_data[f"avg_{col}"] = ( team_data[col] .shift(1) .rolling(window=self.window, min_periods=3) .mean() ) # Form: average points over last N matches team_data["Points"] = team_data.apply( lambda r: 3 if r["GF"] &gt; r["GA"] else (1 if r["GF"] == r["GA"] else 0), axis=1, ) team_data["Form"] = ( team_data["Points"] .shift(1) .rolling(window=self.window, min_periods=3) .mean() ) rolling_stats[team] = team_data return pd.concat(rolling_stats.values()) def build_match_features(self, df: pd.DataFrame) -&gt; pd.DataFrame: """ Join home and away team statistics for each match. """ team_stats = self.compute_team_stats(df) stat_features = [c for c in team_stats.columns if c.startswith("avg_")] stat_features.append("Form") features_list = [] for idx, match in df.iterrows(): home = match["HomeTeam"] away = match["AwayTeam"] date = match["Date"] home_stats = team_stats[ (team_stats["Team"] == home) &amp; (team_stats["Date"] == date) &amp; (team_stats["IsHome"] == 1) ] away_stats = team_stats[ (team_stats["Team"] == away) &amp; (team_stats["Date"] == date) &amp; (team_stats["IsHome"] == 0) ] if home_stats.empty or away_stats.empty: continue row = {"match_idx": idx} for feat in stat_features: h_val = home_stats[feat].values[0] a_val = away_stats[feat].values[0] row[f"home_{feat}"] = h_val row[f"away_{feat}"] = a_val # Difference — one of the strongest features row[f"diff_{feat}"] = h_val - a_val features_list.append(row) features_df = pd.DataFrame(features_list).set_index("match_idx") result = df.join(features_df, how="inner") return result.dropna(subset=[c for c in features_df.columns]) # === Usage === engineer = FeatureEngineer(window=5) featured_data = engineer.build_match_features(clean_data) print(f"Matches with features: {len(featured_data)}") print(f"Number of features: {len([c for c in featured_data.columns if c.startswith(('home_', 'away_', 'diff_'))])}")</code></pre>