Google I/O 2026 starts on May 19, and while we already have a pretty good idea of what to expect, there’s plenty of room for surprises. The tech giant has been all-in on AI for the past few years, and that probably won’t change, but there could be a few hardware announcements on tap this year.
From Android XR glasses to hearing more about Aluminum OS, there’s a lot to look forward to. Below, we’ll fill you in on what we expect Google to talk about during the I/O keynote.
More AI features
We expect Google to announce several new artificial intelligence features that integrate further into its products. Now that agentic AI is all the rage, we’ll most likely see Google lean even further in this direction. This type of AI can perform tasks on your behalf, like controlling your computer, with minimal oversight. We’ll have to wait and see what and how many AI features Google announces this time.
Let’s also not exclude updates to existing or new products that Google could announce. Veo, Lyria, Beam and countless others could get some spotlight at this year’s conference.
Veo and Lyria are Google’s AI-generated video and music tools, respectively, and have continued to improve since they were originally announced. Beam is an ambitious and futuristic way of video conferencing that uses several cameras to make you appear as though you’re speaking directly to the person in front of you as a 3D model.
Gemini 4.0
The next generation Gemini is likely going to be announced at Google I/O 2026
Thomas Fuller/Getty Images
Among all the AI announcements, we’re expecting Google to spend a significant amount of time talking about its flagship AI model, Gemini. Whether it gets a full 4.0 designation or something like 3.8, the new version of Gemini will likely be one of the biggest announcements of Google I/O 2026.
Exactly what Google has been working on with Gemini is anyone’s guess. It’s easy enough to assume that the latest model will be smarter and faster than previous models, but Gemini itself is in nearly every Google product these days, so how the latest and greatest AI from Google trickles down will be interesting to see.
Google recently released a new notebooks feature for Gemini that will let you store sources for a particular topic in one place for easy access. Notebooks are self-contained databases full of sources on a particular topic that you can continue to add to. Gemini will use a notebook for context, so you don’t have to start all over again with information sources.
Those notebooks also sync directly with Google’s AI research assistant NotebookLM, allowing you to create a host of different outputs, like video overviews, charts and more. One of the main differentiators between NotebookLM and Gemini is that NotebookLM will only use your notebook as the source of truth, whereas Gemini will scour the internet with the notebook’s context for the search.
Gemini can also now create dynamic and interactive simulations directly in your chats when you ask it to “show you” or “visualize” something.
Google hasn’t slowed its rollout of Gemini features, so a lot more are likely on their way with the latest version of the AI model.
Android XR Glasses
Android XR will most certainly steal some of the spotlight during this year’s I/O conference.
Andrew Lanxon/CNET
Google showed off its Android XR glasses at last year’s I/O, along with a few partnerships it formed to create them, so we’ll likely see the smart glasses become more of a product than a concept this year.
Smart glasses are gaining popularity, and Google took a while to get back into the space after its first swing at the category. Google Glass was way ahead of its time, but from the demos we’ve seen of Android XR, that patience may have paid off.
Google’s first set of “smart glasses” back in 2013 was a conspicuous pair of spectacles with a protruding display the wearer could view information on, as well as take photos and record video with. The product was met with immediate and significant pushback as an invasion of privacy, and for being elitist and rude; critics eventually dubbed wearers “Glassholes.”
A lot has changed since the introduction of Google Glass, and Android XR glasses won’t look nearly as obvious when released, which arguably makes the privacy questions thornier, but at least they’ll come with a load of usable features like heads-up notifications, live translation and Gemini Live. They’re also launching into an established market now, with competition from Meta’s smart glasses collaborations with Ray-Ban, Oakley and others. Samsung’s own Galaxy XR headset runs on the Android XR platform and is already available to purchase; as the first piece of hardware on the platform, it paves the way for more, with smart glasses a natural next step.
Google I/O could bring us more demos, final hardware details and a release date for when you’ll be able to get Android XR glasses in your hands. Given that there are multiple partners in the ring, the price ranges could vary, potentially offering both entry-level and high-end offerings.
Android 17
Google/Screenshot by CNET
Android is Google’s playground for showcasing the best of its AI features, though some of them may be exclusive to the new Pixel phones we expect to see later this year.
Google released the first beta version of Android 17, its mobile operating system, back in February, and three additional betas have followed, with the latest arriving in mid-April. We can expect the final version of the OS to be released sometime in June or July, shortly before the next family of Pixel devices is announced. For the past few years, the new Pixel lineup has been announced in August during the Made by Google event.
So far, there are no blockbuster features in the Android 17 beta, but Google has introduced interesting tweaks throughout. One of the most interesting features so far is app bubbles, which allows you to quickly access any app in a floating window and dismiss it to a bubble on your screen.
Last year, Google separated its Android announcements into a separate show a week before its I/O conference: The Android Show. This allowed Google to spend more time talking about AI without sacrificing the announcements it had on tap for Android. Whether The Android Show will return this year remains to be seen — though reportedly, a YouTube placeholder for the event was accidentally set live last week before being taken down.
Aluminum OS
One of the most interesting projects Google has been cooking up is a new operating system that merges Android and ChromeOS. Dubbed Aluminum OS, the product will bring Android to laptops and other devices with the full Chrome web browsing experience.
When exactly we’ll see hardware for the new OS is still unknown, but Google could surprise us with partnership announcements or even a full product announcement at I/O this year. The return of a Google-made Pixelbook doesn’t seem out of the realm of possibility, either.
Merging Google’s two operating systems should make for a more seamless software experience in how Aluminum OS computers and Android phones interact.
When we launched the Equities Entity Store in Mathematica, it revolutionized how financial professionals interact with market data by bringing semantic structure, rich metadata, and analysis-ready information into a unified framework. Mathematica’s EntityStore provided an elegant way to explore equities, ETFs, indices, and factor models through a symbolic interface. However, the industry landscape has evolved—the majority of quantitative finance, data science, and machine learning now thrives in Python.
While platforms like FactSet, WRDS, and Bloomberg provide extensive financial data, quantitative researchers still spend up to 80% of their time wrangling data rather than building models. Current workflows often involve downloading CSV files, manually cleaning them in pandas, and stitching together inconsistent time series—all while attempting to avoid subtle lookahead bias that invalidates backtests.
Recognizing these challenges, we’ve reimagined the Equities Entity Store for Python, focusing first on what the Python ecosystem does best: scalable machine learning and robust data analysis.
The Python Version: What’s New
Rather than beginning with metadata-rich entity hierarchies, the Python Equities Entity Store prioritizes the intersection of high-quality data and predictive modeling capabilities. At its foundation lies a comprehensive HDF5 dataset containing over 1,400 features for 7,500 stocks, measured monthly from 1995 to 2025—creating an extensive cross-sectional dataset optimized for sophisticated ML applications.
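A quick sanity check on those dimensions explains the choice of HDF5. The sketch below (plain NumPy; the store's actual schema is not shown here) works out the panel's rough size and demonstrates the point-in-time slicing pattern such a monthly panel supports:

```python
import numpy as np

# Dimensions quoted above: monthly data 1995-2025, ~7,500 stocks, 1,400+ features
n_months = (2025 - 1995) * 12                  # 360 monthly cross-sections
n_stocks, n_features = 7_500, 1_400

# Stored as float32, the full panel is sizable, so HDF5 chunking
# and partial reads matter
bytes_needed = n_months * n_stocks * n_features * 4
print(f"{bytes_needed / 1e9:.1f} GB")          # 15.1 GB

# Point-in-time access: a toy stand-in panel, sliced so that month t
# sees only the history strictly before it
panel = np.zeros((12, 5, 3), dtype=np.float32)
t = 6
visible = panel[:t]
print(visible.shape)                           # (6, 5, 3)
```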
Our lightweight, purpose-built package includes specialized modules for:
Feature loading: Efficient extraction and manipulation of data from the HDF5 store
Feature preprocessing: Comprehensive tools for winsorization, z-scoring, neutralization, and other essential transformations
Label construction: Flexible creation of target variables, including 1-month forward information ratio
Ranking models: Advanced implementations including LambdaMART and other gradient-boosted tree approaches
Portfolio construction: Sophisticated tools for converting model outputs into actionable investment strategies
Backtesting and evaluation: Rigorous performance assessment across multiple metrics
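The preprocessing module covers the standard cross-sectional recipe. As a rough illustration of what winsorization followed by per-month z-scoring does (plain NumPy, not the package's own API), consider:

```python
import numpy as np

def winsorize(x, lower=0.01, upper=0.99):
    """Clip each month's cross-section to its own percentile bounds."""
    lo, hi = np.nanquantile(x, [lower, upper], axis=1, keepdims=True)
    return np.clip(x, lo, hi)

def zscore(x):
    """Standardize each month's cross-section to zero mean, unit variance."""
    mu = np.nanmean(x, axis=1, keepdims=True)
    sd = np.nanstd(x, axis=1, keepdims=True)
    return (x - mu) / np.where(sd == 0, 1.0, sd)

rng = np.random.default_rng(42)
raw = rng.normal(size=(6, 100))   # 6 months x 100 stocks, one feature
raw[0, 0] = 50.0                  # an outlier, clipped by winsorization
clean = zscore(winsorize(raw))
print(np.allclose(np.nanmean(clean, axis=1), 0.0))  # True
```

Neutralization (removing sector or style exposures from a feature) follows the same per-month pattern.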
Guaranteed Protection Against Lookahead Bias
A critical advantage of our Python Equities Entity Store implementation is its robust safeguards against lookahead bias—a common pitfall that compromises the validity of backtests and predictive models. Modern ML preprocessing pipelines often inadvertently introduce information from the future into training data, leading to unrealistic performance expectations.
Unlike platforms such as QuantConnect, Zipline, or even custom research environments that require careful manual controls, our system integrates lookahead protection at the architectural level:
# Example: Time-aware feature standardization with strict temporal boundaries
from equityentity.features.preprocess import TimeAwareStandardizer
# This standardizer only uses data available up to each point in time
standardizer = TimeAwareStandardizer(lookback_window='60M')
zscore_features = standardizer.fit_transform(raw_features)
# Instead of the typical approach that inadvertently leaks future data:
# DON'T DO THIS: sklearn.preprocessing.StandardScaler().fit_transform(raw_features)
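The point of a time-aware standardizer is that month t's statistics come only from months before t. A minimal rolling-window version conveys the idea (a hypothetical helper, not the package's implementation):

```python
import numpy as np

def time_aware_zscore(x, lookback=60):
    """Standardize each row using only the preceding `lookback` rows,
    so no future information leaks into month t's features."""
    out = np.full_like(x, np.nan, dtype=float)
    for t in range(x.shape[0]):
        past = x[max(0, t - lookback):t]   # strictly before month t
        if len(past) == 0:
            continue                       # no history yet: leave NaN
        mu, sd = past.mean(axis=0), past.std(axis=0)
        out[t] = (x[t] - mu) / np.where(sd == 0, 1.0, sd)
    return out

x = np.arange(10, dtype=float).reshape(10, 1)  # a trending series
z = time_aware_zscore(x, lookback=3)
print(np.isnan(z[0, 0]))   # True: the first month has no history
```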
Multiple safeguards are integrated throughout the system:
Point-in-time data snapshots: Features reflect only information available at the decision point
New listing delay: Stocks are only included after a customizable delay period from their first trading date
# From our data_loader.py - IPO bias protection through months_delay
for i, symbol in enumerate(symbols):
    first_date = universe_df[universe_df["Symbol"] == symbol]["FirstDate"].iloc[0]
    delay_end = first_date + pd.offsets.MonthEnd(self.months_delay)
    valid_mask[:, i] = dates_pd > delay_end
Versioned historical data: Our HDF5 store maintains proper vintages to reflect real-world information availability
Pipeline validation tools: Built-in checks flag potential lookahead violations during model development
While platforms like Numerai provide pre-processed features to prevent lookahead, they limit you to their feature set. EES gives you the same guarantees while allowing complete flexibility in feature engineering—all with verification tools to validate your pipeline’s temporal integrity.
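A pipeline check for temporal integrity can be surprisingly simple: truncate the history at month t, recompute the transform, and see whether row t changes. This sketch (a hypothetical check, not the package's validator) flags a globally fit z-score while passing an expanding mean:

```python
import numpy as np

def leaks_future(transform, x, t):
    """Return True if `transform`'s output at row t depends on rows after t."""
    full = transform(x)
    truncated = transform(x[: t + 1])   # the world as known at month t
    return not np.allclose(full[t], truncated[t], equal_nan=True)

# A full-sample z-score uses the whole history's mean/std, so it leaks...
global_z = lambda x: (x - x.mean(axis=0)) / x.std(axis=0)
# ...while an expanding mean only ever looks backward.
expanding_mean = lambda x: np.cumsum(x, axis=0) / np.arange(1, len(x) + 1)[:, None]

x = np.random.default_rng(1).normal(size=(24, 3))
print(leaks_future(global_z, x, t=10))        # True
print(leaks_future(expanding_mean, x, t=10))  # False
```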
Application: Alpha from Feature Ranking
As a proof of concept, we’ve implemented a sophisticated stock ranking system using the LambdaMART algorithm, applied to a universe of current and former components of the S&P 500 Index. The target label is the 1-month forward information ratio (IR_1m), constructed as:

IR_1m = (r_i,t+1 − r_benchmark) / σ

where r_i,t+1 is the forward 1-month return of stock i, r_benchmark is the corresponding sector benchmark return, and σ is the tracking error.
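With those symbols in hand, the label is elementwise arithmetic over the (months × stocks) panel. Toy numbers, with each sector benchmark broadcast across its members:

```python
import numpy as np

fwd_ret = np.array([[0.02, -0.01],    # r_{i,t+1}: forward 1-month returns
                    [0.03,  0.00]])
bench   = np.array([[0.01],           # r_benchmark: sector benchmark return
                    [0.01]])
te      = np.array([[0.05,  0.04],    # sigma: tracking error per stock
                    [0.05,  0.04]])

ir_1m = (fwd_ret - bench) / te        # the IR_1m label
print(round(ir_1m[0, 0], 3))          # 0.2
```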
Using the model’s predicted rank scores, we form decile portfolios rebalanced monthly over a 25-year period (2000-2025), with an average turnover of 66% per month.
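Turnover here is the fraction of names replaced at each rebalance; a small helper makes the 66% figure concrete (illustrative accounting, not necessarily the package's exact definition):

```python
import numpy as np

def monthly_turnover(holdings_by_month):
    """Average fraction of portfolio names replaced between rebalances."""
    rates = []
    for prev, curr in zip(holdings_by_month, holdings_by_month[1:]):
        replaced = len(set(curr) - set(prev))
        rates.append(replaced / len(curr))
    return float(np.mean(rates))

# Two of three names change each month -> ~66.7% turnover
print(round(monthly_turnover([["A", "B", "C"],
                              ["A", "D", "E"],
                              ["A", "F", "G"]]), 3))   # 0.667
```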
The top decile (Decile 10) portfolio demonstrates a Sharpe Ratio of approximately 0.8 with an annualized return of 17.8%—impressive performance that validates our approach. As shown in the cumulative return chart, performance remained consistent across different market regimes, including the 2008 financial crisis, the 2020 pandemic crash, and subsequent recovery periods.
Risk-adjusted performance increases across the decile portfolios, indicating that the selected factors provide real explanatory power.
Looking at the feature importance chart, the most significant features include:
Technical features:
Volatility metrics dominate with “Volatility_ZScore” being the most important feature by a wide margin
Our model was trained on data from 1995-1999 and validated on an independent holdout set before final out-of-sample testing from 2000-2025, during which the model is refit every 60 months.
This rigorous approach to validation ensures that our performance metrics reflect realistic expectations rather than in-sample overfitting.
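The split scheme reduces to index arithmetic: each refit trains on everything before its test window, then scores the next 60 months. A simplified sketch of that protocol:

```python
def walk_forward_splits(n_months, first_test=60, refit_every=60):
    """Return (train_start, test_start, test_end) triples: train on
    [train_start, test_start), test on [test_start, test_end)."""
    splits = []
    test_start = first_test
    while test_start < n_months:
        test_end = min(test_start + refit_every, n_months)
        splits.append((0, test_start, test_end))
        test_start = test_end
    return splits

# 1995-2025 monthly (~360 rows): first out-of-sample month is index 60 (year 2000)
splits = walk_forward_splits(360)
print(len(splits))        # 5 refits cover 2000-2025
print(splits[0])          # (0, 60, 120)
```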
This diverse feature set confirms that durable alpha generation requires the integration of multiple orthogonal signals unified under a common ranking framework—precisely what our Python Equities Entity Store facilitates. The dominance of volatility-related features suggests that risk management is a critical component of the model’s predictive power.
The package layout (excerpt):

├── portfolio/
│ └── constructor.py # Create portfolios from rank scores
├── backtest/
│ └── evaluator.py # Sharpe, IR, turnover, hit rate
└── entity/ # Optional metadata (JSON to dataclass)
├── equity.py
├── etf.py
└── index.py
Code Example: Ranking Model Training
Here’s how the ranking model module works, leveraging LightGBM’s LambdaMART implementation:
import lightgbm as lgb

class RankModel:
    def __init__(self, max_depth=4, num_leaves=32, learning_rate=0.1, n_estimators=500,
                 use_gpu=True, feature_names=None):
        self.params = {
            "objective": "lambdarank",
            "max_depth": max_depth,
            "num_leaves": num_leaves,
            "learning_rate": learning_rate,
            "n_estimators": n_estimators,
            "device": "gpu" if use_gpu else "cpu",
            "verbose": -1,
            "max_position": 50
        }
        self.model = None
        self.feature_names = feature_names if feature_names is not None else []

    def train(self, features, labels):
        # Reshape features and labels for LambdaMART format
        n_months, n_stocks, n_features = features.shape
        X = features.reshape(-1, n_features)
        y = labels.reshape(-1)
        group = [n_stocks] * n_months
        train_data = lgb.Dataset(X, label=y, group=group, feature_name=self.feature_names)
        self.model = lgb.train(self.params, train_data)
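The reshape inside train() is the subtle part: LambdaMART expects one flat row per (month, stock) pair, plus a group vector marking where each month's ranking list begins and ends. A NumPy-only illustration of that layout:

```python
import numpy as np

rng = np.random.default_rng(7)
features = rng.normal(size=(3, 4, 2))   # 3 months x 4 stocks x 2 features
labels = rng.normal(size=(3, 4))        # one label per (month, stock)

n_months, n_stocks, n_features = features.shape
X = features.reshape(-1, n_features)    # (12, 2): stacked cross-sections
y = labels.reshape(-1)                  # (12,)
group = [n_stocks] * n_months           # [4, 4, 4]: one ranking list per month

print(X.shape, sum(group) == len(X))    # (12, 2) True
```

The group vector is what tells the ranker to compare stocks only within the same month, never across months.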
Portfolio Construction
The system seamlessly transitions from predictive scores to portfolio allocation with built-in transaction cost modeling:
# Portfolio construction with transaction cost awareness
def construct_portfolios(self):
    n_months, n_stocks = self.pred_scores.shape
    n_decile = n_stocks // 10  # decile size
    for t in range(n_months):
        # Get predictions and forward returns
        scores = self.pred_scores[t]
        returns_t = self.returns[min(t + 1, n_months - 1)]
        symbols_t = self.symbols[t]  # per-month symbol array (class attribute)
        # Select top and bottom deciles
        sorted_idx = np.argsort(scores)
        long_idx = sorted_idx[-n_decile:]
        short_idx = sorted_idx[:n_decile]
        # Raw equal-weighted decile returns
        long_raw = returns_t[long_idx].mean()
        short_raw = returns_t[short_idx].mean()
        # Calculate transaction costs from portfolio turnover
        curr_long_symbols = set(symbols_t[long_idx])
        curr_short_symbols = set(symbols_t[short_idx])
        long_trades = len(curr_long_symbols.symmetric_difference(self.prev_long_symbols))
        short_trades = len(curr_short_symbols.symmetric_difference(self.prev_short_symbols))
        tx_cost_long = self.tx_cost * long_trades
        tx_cost_short = self.tx_cost * short_trades
        # Calculate net returns with costs
        long_ret = long_raw - tx_cost_long
        short_ret = -short_raw - tx_cost_short - self.loan_cost
        # Carry holdings forward for next month's turnover calculation
        self.prev_long_symbols = curr_long_symbols
        self.prev_short_symbols = curr_short_symbols
Complete Workflow Example
The package is designed for intuitive workflows with minimal boilerplate. Here’s how simple it is to get started:
from equityentity.features import FeatureLoader, LabelGenerator
from equityentity.models import LambdaMARTRanker
from equityentity.portfolio import DecilePortfolioConstructor
# Load features with point-in-time awareness
loader = FeatureLoader(hdf5_path="equity_features.h5")
features = loader.load_features(start_date="2010-01-01", end_date="2025-01-01")
# Generate IR_1m labels
label_gen = LabelGenerator(benchmark='sector_returns')
labels = label_gen.create_information_ratio(forward_period='1M')
# Train a ranking model
ranker = LambdaMARTRanker(n_estimators=500, learning_rate=0.05)
ranker.fit(features, labels)
# Create portfolios from predictions
constructor = DecilePortfolioConstructor(rebalance_freq='M')
portfolios = constructor.create_from_scores(ranker.predict(features))
# Evaluate performance
performance = portfolios['decile_10'].evaluate()
print(f"Sharpe Ratio: {performance['sharpe_ratio']:.2f}")
print(f"Information Ratio: {performance['information_ratio']:.2f}")
print(f"Annualized Return: {performance['annualized_return']*100:.1f}%")
The package supports both configuration file-based workflows for production use and interactive Jupyter notebook exploration. Output formats include pandas DataFrames, JSON for web applications, and HDF5 for efficient storage of results.
Why Start with Cross-Sectional ML?
While Mathematica’s EntityStore emphasized symbolic navigation and knowledge representation, Python excels at algorithmic learning and numerical computation at scale. Beginning with the HDF5 dataset enables immediate application by quantitative researchers, ML specialists, and strategy developers interested in:
Exploring sophisticated feature engineering across time horizons and market sectors
Building powerful predictive ranking models with state-of-the-art ML techniques
Constructing long-short portfolios with dynamic scoring mechanisms
Developing robust factor models and alpha signals
And because we’ve already created metadata-rich JSON files for each entity, we can progressively integrate the symbolic structure—creating a hybrid system where machine learning capabilities complement knowledge representation.
Increasingly, quantitative researchers are integrating tools like LangChain, GPT-based agents, and autonomous research pipelines to automate idea generation, feature testing, and code execution. The structured design of the Python Equities Entity Store—with its modularity, metadata integration, and time-consistent features—makes it ideally suited for use as a foundation in LLM-driven quantitative workflows.
Competitive Pricing and Value
While alternative platforms in this space typically come with significant cost barriers, we’ve positioned the Python Equities Entity Store to be accessible to firms of all sizes.
While open-source platforms like QuantConnect, Zipline, and Backtrader provide accessible backtesting environments, they often lack the scale, granularity, and point-in-time feature control required for advanced cross-sectional ML strategies. The Python Equities Entity Store fills this gap—offering industrial-strength data infrastructure, lookahead protection, and extensibility without the steep cost of commercial platforms.
Unlike these competitors that often require multiple subscriptions to achieve similar functionality, Python Equities Entity Store provides an integrated solution at a fraction of the cost. This pricing strategy reflects our commitment to democratizing access to institutional-grade quantitative tools.
Next Steps
We’re excited to announce our roadmap for the Python Equities Entity Store:
July 2025 Release: The official launch of our HDF5-compatible package, complete with:
Comprehensive documentation and API reference
Jupyter notebooks demonstrating key workflows from data loading to portfolio construction
Example strategies showcasing the system’s capabilities across different market regimes
Performance benchmarks and baseline models with full backtest history
Python package available via PyPI (pip install equityentity)
Docker container with pre-loaded example datasets
Q3 2025: Integration of the symbolic entity framework, allowing seamless navigation between quantitative features and qualitative metadata
Q4 2025: Extension to additional asset classes and alternative data sources, expanding the system’s analytical scope
Early 2026: Launch of a cloud-based computational environment for collaboration and strategy sharing
Accessing the Python Equities Entity Store
As a special promotion, existing users of the current Mathematica Equities Entity Store Enterprise Edition will be given free access to the Python version on launch.
So, if you sign up now for the Enterprise Edition you will receive access to both the existing Mathematica version and the new Python version as soon as it is released.
After the launch of the Python Equities Entity Store, each product will be sold separately, so this limited-time offer represents a 50% discount.
By prioritizing scalable feature datasets and sophisticated ranking models, the Python version of the Equities Entity Store positions itself as an indispensable tool for modern equity research. It bridges the gap between raw data and actionable insights, combining the power of machine learning with the structure of knowledge representation.
The Python Equities Entity Store represents a significant step forward in quantitative finance tooling—enabling faster iteration, more robust models, and ultimately, better investment decisions.