
Machine Learning for Financial Trading


Sukrit Sunama (@sunama.sukrit)

Understanding AI Trading Agents: How Reinforcement Learning Powers Automated Trading

When people hear "AI," they often think of ChatGPT or other generative AI models categorized as LLMs (Large Language Models). However, a new AI paradigm is gaining attention—Agent AI, which I’ll explore in this article.

The Four Categories of AI

Before diving into Agent AI, let's first classify AI based on how it is trained and its purpose:

  1. Supervised AI – This type of AI learns from labeled data, meaning humans (or other intelligent entities) provide direct input-output mappings. Common examples include email spam classification and image recognition. Interestingly, the first version of ChatGPT was a supervised AI model, requiring human-written questions and answers for training.
  2. Unsupervised AI – Unlike supervised AI, this category doesn't require manually labeled data; it learns patterns directly from raw data. Examples include clustering similar users or images without labels, and the self-supervised pre-training behind recent versions of ChatGPT, which learn from large volumes of unlabeled web text.
  3. Semi-Supervised AI – This hybrid approach uses both labeled and unlabeled data. A good example is recommendation systems that suggest videos or products based on user interactions.
  4. Reinforcement Learning (RL) – This is where Agent AI comes into play. Unlike the other types, RL learns by trial and error: an agent takes actions in an environment and gradually adjusts its behavior to maximize a reward signal tied to its objective.

Reinforcement Learning & AI Agents

The foundation of RL is built on multiple theories, enabling its real-world applications in autonomous vehicles, aircraft autopilots, and trading agents. The key concepts behind RL include:

  1. Markov Decision Process (MDP) – A framework for making optimal decisions in uncertain environments by analyzing sequential data.
  2. Multi-Armed Bandit Problem – A decision-making problem where an agent must choose the best action from a fixed set. In trading, for example, an agent may decide between holding, buying, or selling at each time step.
  3. Artificial Neural Networks (ANNs) – A fundamental component of modern AI models. Loosely inspired by the brain, an ANN stacks layers of linear transformations with nonlinear activations, allowing the agent to handle complex decision-making tasks.
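To make the multi-armed bandit idea concrete, here is a minimal epsilon-greedy sketch for the hold/buy/sell choice. It illustrates the general technique only; it is not code from any particular trading system:

```python
import random

ACTIONS = ["hold", "buy", "sell"]

def epsilon_greedy(q_values, epsilon=0.1):
    """Explore a random action with probability epsilon,
    otherwise exploit the best-known action."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])

def update(q_values, counts, action, reward):
    """Incrementally update the running-average value estimate of an action."""
    counts[action] += 1
    q_values[action] += (reward - q_values[action]) / counts[action]

# Start with no knowledge of any action's value.
q_values = {a: 0.0 for a in ACTIONS}
counts = {a: 0 for a in ACTIONS}
```

Each trading step, the agent picks an action, observes a reward (for example, realized profit), and updates its estimate; over time the estimates converge toward each action's true value.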

Training an AI Trading Agent

For an RL-based trading agent, rewards act as fuel, guiding the AI toward its objective. The most basic reward function is ROI (Return on Investment) since maximizing profit is the primary goal. However, using ROI alone can lead to overfitting, where the agent performs well in training but fails in real-world trading. To prevent overfitting, additional penalties and rewards must be incorporated into the reward function, such as:

  • Overtrading penalty – To discourage excessive trading.
  • Drawdown penalty – To reduce exposure to high-risk scenarios.
  • Sharpe ratio reward/penalty – To encourage stable, risk-adjusted returns.
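A composite reward combining these terms might look like the sketch below. The weights and the trade budget are illustrative assumptions, not values from any real system:

```python
import numpy as np

def composite_reward(step_roi, returns, trades, equity_curve,
                     trade_limit=10, w_overtrade=0.2,
                     w_drawdown=0.5, w_sharpe=0.1):
    """ROI base reward plus risk-shaping terms (all weights are illustrative)."""
    r = step_roi

    # Overtrading penalty: charge for every trade past the budget.
    if trades > trade_limit:
        r -= w_overtrade * (trades - trade_limit)

    # Drawdown penalty: distance of current equity from its running peak.
    peak = np.maximum.accumulate(equity_curve)
    r -= w_drawdown * (peak[-1] - equity_curve[-1]) / peak[-1]

    # Sharpe-style bonus: reward stable, risk-adjusted returns.
    if len(returns) > 1 and np.std(returns) > 0:
        r += w_sharpe * np.mean(returns) / np.std(returns)
    return r
```

In practice, the relative weights are themselves hyperparameters: too strong a drawdown penalty makes the agent overly passive, while too weak an overtrading penalty lets it churn fees away.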

Can Trading Agents Generate Buy/Sell Signals?

Many traders expect AI trading agents to generate explicit buy/sell signals. However, most RL-based trading agents follow a multi-armed bandit approach, meaning they can only take predefined actions like hold, buy, or sell.

To generate buy/sell signals with target prices, we could introduce additional actions, such as:

  • Buy signal at +2.5% price increase
  • Buy signal at +5% price increase
  • Buy signal at +10% price increase
  • Sell signal at -2.5% price decrease
  • Sell signal at -5% price decrease
  • Sell signal at -10% price decrease
  • Hold

However, increasing the number of actions requires significantly more data and training steps, making the model more complex. In practice, using an agent for direct auto-trading is often a better approach than relying on discrete trading signals.
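Such an expanded action space can be encoded as (direction, target move) pairs. The helper below is a hypothetical sketch of how a discrete action would map to an absolute target price:

```python
# Expanded discrete action space: buy/sell signals with a target move, plus hold.
TARGET_MOVES = [0.025, 0.05, 0.10]

ACTIONS = (
    [("buy", m) for m in TARGET_MOVES]
    + [("sell", -m) for m in TARGET_MOVES]
    + [("hold", 0.0)]
)

def target_price(last_price, action):
    """Translate a discrete action into an absolute target price (None for hold)."""
    direction, move = action
    if direction == "hold":
        return None
    return last_price * (1.0 + move)
```

Note how the action count jumps from 3 to 7; each extra action dilutes the experience available per action, which is why more data and training steps are needed.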

Key Inputs for AI Trading Agents

For training a trading agent, I recommend starting with technical indicators and time-based features. As you advance, you can incorporate feature engineering, news sentiment analysis, and correlated assets to improve performance.
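As a starting point, a feature table along those lines might be built like this. The indicator choices and window sizes are illustrative assumptions, not a recommended production set:

```python
import numpy as np
import pandas as pd

def build_features(df):
    """df: DataFrame with a DatetimeIndex and a 'close' column.
    Returns simple technical features plus time-based inputs
    (illustrative starter set only)."""
    out = pd.DataFrame(index=df.index)
    out["return_1"] = df["close"].pct_change()
    out["sma_ratio"] = df["close"] / df["close"].rolling(24).mean()
    out["volatility"] = out["return_1"].rolling(24).std()
    # Cyclical time encoding so hour 23 sits next to hour 0.
    hours = df.index.hour
    out["hour_sin"] = np.sin(2 * np.pi * hours / 24)
    out["hour_cos"] = np.cos(2 * np.pi * hours / 24)
    return out.dropna()
```

Scaling features to comparable ranges (returns and ratios rather than raw prices) also helps the agent generalize across price regimes.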

Conclusion

AI trading agents leverage reinforcement learning to make buy, sell, or hold decisions based on trained data. While they excel at fully automated trading, they aren't typically designed for sending explicit trade signals unless specifically programmed to do so. With the right input features, reward functions, and risk management strategies, AI agents can offer a powerful, automated approach to crypto trading.


Building Statera: My 3-Year Journey to Creating a Profitable Crypto Trading Agent

This is the story of how I built Statera, my first AI-powered trading agent. But before diving into the journey, let’s first understand what a Statera agent is.

What is Statera?

Statera is an AI trading agent designed for spot trading BTCUSDT. Its primary goal is to generate profits exceeding the simple buy-and-hold strategy while managing risks effectively. Sounds simple, right? But in reality, it took me three years to develop a version that consistently generates profits while minimizing losses.

The Beginning: A Flawed Approach (2022)

In 2022, I had the idea of using AI to trade crypto instead of relying on manual strategies, which felt no different from gambling. I began studying algorithmic trading and quickly discovered that technical analysis was widely regarded as the key to success.

However, after extensive manual and algorithmic trading, I realized something was wrong. Despite numerous books, research papers, and tutorials claiming that technical analysis works, I couldn't find a sustainable, repeatable strategy. It took me three years to fully grasp these fundamental issues:

  • Some indicators work, but most are ineffective. Correlation analysis is essential for identifying the useful ones.

  • Backtesting is crucial, but the length of your dataset determines the reliability of your strategy. Many traders lack sufficient historical data, leading to unreliable models.

  • Single-timeframe trading is problematic, especially on short timeframes (1m, 5m, 15m). These are highly volatile and require longer historical windows or multi-timeframe analysis.

Because of these misconceptions, I wasted enormous time and computing resources training ineffective agents. Despite experimenting with countless indicators, window sizes, and neural network architectures, nothing worked. By the end of 2022, after countless failed experiments, I nearly gave up, wrongly concluding that technical analysis itself was the problem.

The AI Renaissance: A Fresh Perspective (2024)

Fast forward to 2024—AI had advanced across industries, from image processing to autonomous vehicles. Inspired by this progress, I revisited my trading agent with fresh insights. I realized my failure wasn’t due to AI itself but because my approach was flawed.

The key issue? Overfitting.

My agent performed exceptionally well on training data but failed in real-world scenarios. Fixing this problem became my primary focus throughout 2024. While I don’t remember the exact sequence of my discoveries, here are the most critical lessons I learned:

Key Lessons from 2024

  1. Input Sensitivity Matters - Agents are highly sensitive to inputs, making input diversification crucial. However, real-world data limitations exist. For example, a 1-hour timeframe dataset with 100,000 rows covers only 11.41 years—a minimum requirement for training. My initial solution of training across multiple assets failed. Instead, adding Gaussian noise to my limited dataset (50,000 rows) improved generalization.

  2. Neural Network Complexity ≠ Better Performance - Initially, I assumed that increasing model complexity would yield better results. I wasted significant time tweaking hidden layers, node sizes, and hyperparameters. Instead of blindly randomizing parameters, systematic experimentation proved far more effective in finding the optimal balance between underfitting and overfitting.

  3. The Market Has Seasonal Patterns - Asset prices fluctuate based on time. Incorporating time as an input feature significantly improved my agent’s performance.

  4. “Reward per Step” is the Best Performance Indicator - The most effective way to measure model fitness isn’t raw profitability but rather reward per step. Crafting an optimal reward function is a complex topic I’ll cover in a future article.

  5. Hyperparameter Tuning Requires Multi-Stage Training - Standard reinforcement learning (RL) tutorials often use a single set of hyperparameters. However, I found that splitting training into 2–5 phases, with gradual hyperparameter adjustments, yielded better models.
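A multi-stage schedule like the one in lesson 5 can be expressed as plain data plus a loop. The phase values below are illustrative, and `train_phase` would wrap whatever your RL library's training call is (for example, a PPO `learn()` step); both names are assumptions of this sketch:

```python
# Multi-stage training: each phase continues from the previous model
# with a smaller learning rate and entropy coefficient (values illustrative).
PHASES = [
    {"timesteps": 1_000_000, "learning_rate": 3e-4, "ent_coef": 0.01},
    {"timesteps": 1_500_000, "learning_rate": 1e-4, "ent_coef": 0.005},
    {"timesteps": 1_500_000, "learning_rate": 3e-5, "ent_coef": 0.001},
]

def run_schedule(train_phase, phases=PHASES):
    """train_phase(timesteps=..., learning_rate=..., ent_coef=...) is an
    injected callback wrapping one training run, keeping this sketch
    library-agnostic."""
    for phase in phases:
        train_phase(**phase)
```

The intuition is standard annealing: early phases explore broadly with high entropy, later phases fine-tune with smaller, more conservative updates.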

Statera’s Current Status (March 2025)

As of March 2025, Statera has been trained on BTCUSDT for approximately 4 million steps—equivalent to 456 years of historical data. The architecture consists of:

  • Features: Technical indicators + time-based inputs

  • Model: LSTM for sequential data, followed by a linear model

  • Training Method: PPO (Proximal Policy Optimization)
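That stack might be sketched in PyTorch as follows. The layer sizes, feature count, and 48-step window are assumptions for illustration; the actual Statera architecture isn't published beyond the outline above:

```python
import torch
import torch.nn as nn

class TradingPolicy(nn.Module):
    """LSTM over the feature window, then a linear head producing
    action logits (hold/buy/sell). All sizes are illustrative."""
    def __init__(self, n_features=8, hidden=64, n_actions=3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, x):  # x: (batch, seq_len, n_features)
        out, _ = self.lstm(x)
        # Use only the last time step's hidden state for the decision.
        return self.head(out[:, -1])
```

Under PPO, these logits would be passed through a softmax to form the action probability distribution, alongside a separate value head for the critic.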

Performance Results

After testing the agent with a limited 3-month dataset, here’s what I found:

  • Profitability: 20–40% gains over 3 months (Oct 2024–Jan 2025)

  • Market Benchmark: BTCUSDT returned 21% during the same period

  • Consistency: PPO does not produce deterministic actions but instead outputs probability distributions. Across multiple backtests, Statera consistently outperformed buy-and-hold strategies.
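Because each backtest samples from the policy's output distribution, results vary slightly between runs. A simple way to summarize several runs against the buy-and-hold benchmark (a hypothetical helper, not Statera's actual evaluation code):

```python
import numpy as np

def buy_and_hold_roi(prices):
    """ROI of simply holding from the first price to the last."""
    return prices[-1] / prices[0] - 1.0

def backtest_summary(equity_runs, prices):
    """Summarize several stochastic backtest equity curves vs. buy-and-hold."""
    benchmark = buy_and_hold_roi(prices)
    rois = [run[-1] / run[0] - 1.0 for run in equity_runs]
    return {
        "strategy_mean_roi": float(np.mean(rois)),
        "benchmark_roi": float(benchmark),
        "win_rate_vs_benchmark": float(np.mean([r > benchmark for r in rois])),
    }
```

Reporting a win rate across runs, rather than a single backtest number, is what makes a claim like "consistently outperformed buy-and-hold" checkable.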

Given these results, I decided to release the agent for further testing under real-market conditions.

The Future: A Flagship 1-Minute Trading Model

My next goal is to develop a flagship model that trades on a 1-minute timeframe using multi-timeframe inputs. The process is ongoing, and I’ll share more insights as I refine the model. Stay tuned for future updates on my blog!

Thanks for reading!

Statera 2025.03

This agent is designed to maximize profit for spot trading while balancing profit and risk. It uses the latest 48 hours of price data and the current state of the agent. Inputs include the current time and indicators such as RSI, Stochastic, ADX, ATR, OBV, and Bollinger Bands.


Unveiling Bitcoin’s Seasonal Patterns: Predictable Price Trends from 2018 to 2024

When I first observed this pattern, I felt a chill run down my spine—it was hard to believe how accurately the month impacts Bitcoin's price movements. But trust me, data doesn't lie. In this article, I'll walk you through fascinating insights I’ve uncovered through data science.

BTC 4h, Month and %Change between 2018 - 2024


What is the CNS at the top of this website?

Crypto News Sentiment (CNS) is a crucial indicator for cryptocurrency traders. It offers real-time insights into the overall sentiment surrounding the crypto market, derived from a comprehensive analysis of news articles.

In this post, I'll delve into:

  • How CNS is calculated using advanced AI techniques.
  • The significance of CNS as an independent variable influencing market prices.
  • The practical applications of CNS for informed trading decisions.

By understanding CNS, you can gain a competitive edge and make more strategic trades in the dynamic crypto market.

To calculate Crypto News Sentiment (CNS), a sophisticated AI bot scours numerous crypto-related news sources. The bot analyzes the content of these articles, extracting sentiment values categorized as positive, negative, or neutral. The individual sentiment values are then aggregated into a single score ranging from -1 (strongly negative) to 1 (strongly positive), representing the prevailing sentiment across the crypto news landscape.
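The site doesn't publish its exact aggregation formula, but a minimal sketch, assuming each article is labeled positive, neutral, or negative and the labels are simply averaged, could look like this:

```python
def aggregate_cns(article_sentiments):
    """Average per-article sentiment labels into one CNS score in [-1, 1].
    The label-to-number mapping and plain averaging are assumptions."""
    mapping = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}
    if not article_sentiments:
        return 0.0  # no news in the window: treat as neutral by convention
    return sum(mapping[s] for s in article_sentiments) / len(article_sentiments)
```

Running this over the last 24 hours versus the last 4 hours of articles would yield the two values shown at the top of the site.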

While initial skepticism about the relationship between sentiment analysis and market price was warranted, extensive research and experimentation, including reinforcement learning models, have conclusively demonstrated that sentiment is indeed an independent variable. This means that news sentiment often precedes and influences market price movements, rather than simply reacting to them. This insight underscores the predictive power of CNS as a valuable trading tool.

At the top corner, you'll find two CNS values: a 24-hour value and a 4-hour value. These values represent the sentiment aggregated over the past 24 hours and 4 hours, respectively. This allows for a comparison of short-term and longer-term sentiment trends, providing valuable insights into market volatility and potential price movements.

The potential applications of sentiment analysis in the crypto market are vast and exciting. As I continue my research, I'll be sharing additional insights and discoveries on this website. Stay tuned for updates, and leverage the power of CNS to make informed trading decisions.