Author: Denis Avetisyan
Researchers have developed a foundation model that leverages readily available WiFi signals to understand and interpret surrounding environments, paving the way for smarter, more responsive ambient systems.

AM-FM is the first foundation model for ambient intelligence built on WiFi sensing, utilizing self-supervised learning and channel state information to achieve strong performance across a range of tasks.
Despite the ubiquity of WiFi infrastructure, realizing its full potential for nuanced ambient intelligence – understanding human activity and physiology within physical spaces – has been hampered by the limitations of task-specific sensing models. This paper introduces ‘AM-FM: A Foundation Model for Ambient Intelligence Through WiFi’, which addresses this challenge with the first foundation model trained on a massive dataset of unlabeled WiFi Channel State Information (CSI). By leveraging contrastive learning and physics-informed objectives, AM-FM learns robust, generalizable representations that significantly improve performance across nine downstream tasks with enhanced data efficiency. Could this approach unlock scalable, privacy-preserving ambient intelligence directly from existing wireless infrastructure, fundamentally reshaping smart environments and human-computer interaction?
Beyond Conventional Sensing: Towards True Environmental Awareness
Conventional methods of environmental monitoring, reliant on cameras, microphones, and dedicated sensors, often create friction between data collection and individual privacy. These systems, while capable of precise measurements, require direct observation or physical contact, raising concerns about surveillance and data security. Furthermore, these technologies typically offer a fragmented understanding of spaces; a motion detector reveals when someone is present, but provides little insight into what they are doing or the overall environmental context. Limited by their singular focus and placement, traditional sensors struggle to create a holistic picture of activity, requiring extensive and potentially intrusive networks to achieve comprehensive awareness. This reliance on direct measurement and localized data points presents a significant challenge for applications requiring broad environmental understanding without compromising personal boundaries.
Ambient perception represents a paradigm shift in environmental monitoring, moving beyond the limitations of traditional sensor networks. This innovative approach capitalizes on the pervasive nature of WiFi infrastructure, utilizing readily available signals to create a detailed understanding of surrounding spaces and their occupants – without requiring the deployment of additional hardware or raising significant privacy concerns. Unlike cameras or microphones, WiFi signals can penetrate walls and operate discreetly, offering a comprehensive and non-intrusive method for tracking movement, identifying activity, and even estimating the number of people present. The technology essentially transforms existing WiFi networks into distributed sensing platforms, offering a cost-effective and scalable solution for applications ranging from smart homes and office automation to elderly care and public safety.
Deciphering the subtle nuances within raw WiFi signals to glean insights about a physical environment demands sophisticated computational techniques. The inherent complexity arises from the fact that WiFi was not designed for environmental sensing; signals are prone to multipath fading, interference, and noise, obscuring the information related to occupancy, activity, or even object location. Consequently, researchers employ advanced signal processing algorithms to denoise and isolate relevant signal characteristics, such as the received signal strength indicator (RSSI) and channel state information (CSI). Further complicating matters is the need for robust machine learning models – often deep neural networks – capable of learning the complex relationship between these signal features and the actual environmental conditions. These models require substantial training data, careful feature engineering, and ongoing recalibration to maintain accuracy and adapt to dynamic changes within the monitored space. The pursuit of reliable ambient perception, therefore, hinges on overcoming these considerable challenges in both signal processing and machine learning.
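To make the preprocessing step concrete, here is a minimal sketch of turning complex CSI samples into amplitude and phase and suppressing isolated spikes. The function names are hypothetical, and the moving median filter is only a simple stand-in for the more advanced denoising the text alludes to; it is not taken from the paper.

```python
import cmath
import statistics


def amplitude_and_phase(csi_row):
    """Split one row of complex CSI values (one per subcarrier) into amplitude and phase."""
    amplitudes = [abs(h) for h in csi_row]
    phases = [cmath.phase(h) for h in csi_row]
    return amplitudes, phases


def median_filter(series, window=3):
    """Moving median over a 1-D series; the window shrinks at the edges.

    A median filter is an assumed, simple denoiser chosen for illustration.
    """
    half = window // 2
    return [
        statistics.median(series[max(0, i - half): i + half + 1])
        for i in range(len(series))
    ]
```

A single outlier in an otherwise steady amplitude trace, such as a burst of interference on one packet, is removed by the filter while the surrounding samples are left unchanged.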
AM-FM: A Foundation for Self-Supervised WiFi Sensing
AM-FM leverages self-supervised learning to extract meaningful features directly from raw WiFi Channel State Information (CSI) without requiring labeled datasets. This approach involves training the model to predict aspects of the input data itself, such as masked portions of the CSI or future signal states, thereby forcing it to learn inherent patterns and relationships within the WiFi signal. The resulting learned representations, or embeddings, capture complex characteristics of the wireless environment and any objects or activities within it, and can then be transferred to downstream tasks like activity recognition or device localization with minimal task-specific training data. This contrasts with traditional supervised learning methods which require extensive labeled data for each specific application.
AM-FM leverages multiple self-supervised learning techniques to extract meaningful patterns from raw WiFi Channel State Information (CSI). Masked Reconstruction involves randomly masking portions of the CSI data and training the model to predict the missing values, forcing it to learn dependencies within the signal. Autocorrelation Prediction tasks the model with forecasting future CSI values based on past observations, effectively capturing temporal dynamics. Finally, Contrastive Learning is employed to learn robust representations by bringing similar CSI samples closer together in the embedding space while pushing dissimilar samples apart, enhancing the model’s ability to differentiate between various environmental conditions and activities. These combined techniques allow AM-FM to effectively model both the temporal evolution and spatial characteristics of the WiFi signal without requiring labeled data.
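Two of these objectives can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the paper's implementation: the names `mask_csi`, `masked_mse`, and `info_nce`, the 30% mask ratio, the zero-fill masking, and the temperature of 0.1 are all hypothetical choices, and InfoNCE is used here as one common form of contrastive loss.

```python
import math
import random


def mask_csi(window, mask_ratio=0.3, rng=None):
    """Zero out a random subset of time steps.

    window is a list of time steps, each a list of per-subcarrier values.
    Returns a masked copy and the sorted indices that were masked.
    """
    if rng is None:
        rng = random.Random(0)
    n_masked = max(1, int(len(window) * mask_ratio))
    masked_idx = sorted(rng.sample(range(len(window)), n_masked))
    masked = [row[:] for row in window]
    for t in masked_idx:
        masked[t] = [0.0] * len(window[t])
    return masked, masked_idx


def masked_mse(prediction, target, masked_idx):
    """Mean squared error computed only over the masked time steps."""
    errors = [
        (p - y) ** 2
        for t in masked_idx
        for p, y in zip(prediction[t], target[t])
    ]
    return sum(errors) / len(errors)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: anchors[i] should be closest to positives[i] among all positives."""
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [sum(x * y for x, y in zip(anchor, pos)) / temperature for pos in p]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)
```

In a training loop, an encoder would produce `prediction` from the masked window and the embeddings passed to `info_nce`; the loss is low when each anchor is most similar to its own positive and high when the pairing is scrambled.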
AM-FM addresses the challenges of variable WiFi signal characteristics through Relative Temporal Encoding and Adaptive Frequency Aggregation. Relative Temporal Encoding processes sequential Channel State Information (CSI) data by representing time as relative offsets rather than absolute timestamps, improving robustness to timing drifts and variations in sampling rates. Simultaneously, Adaptive Frequency Aggregation dynamically adjusts the weighting of different frequency subcarriers within the WiFi signal based on their signal strength and reliability; this prioritizes informative frequencies and mitigates the impact of noise or interference, thereby enhancing the model’s performance in diverse and dynamic real-world WiFi environments.
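The two ideas can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: offsets are taken relative to the first sample in a window, and the subcarrier weights are a softmax over mean amplitude as a crude proxy for signal strength. The article does not say how the model computes either quantity, and in a trained model the weighting would more plausibly be learned.

```python
import math


def relative_offsets(timestamps):
    """Express each sample time relative to the first sample in the window."""
    start = timestamps[0]
    return [t - start for t in timestamps]


def subcarrier_weights(amplitudes):
    """Softmax over each subcarrier's mean amplitude (assumed strength proxy).

    amplitudes is a list of time steps, each a list of per-subcarrier values.
    """
    n_steps = len(amplitudes)
    n_subcarriers = len(amplitudes[0])
    scores = [
        sum(row[k] for row in amplitudes) / n_steps
        for k in range(n_subcarriers)
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(amplitudes):
    """Collapse the subcarrier axis into one weighted value per time step."""
    weights = subcarrier_weights(amplitudes)
    return [sum(w * x for w, x in zip(weights, row)) for row in amplitudes]
```

Because only differences between timestamps are used, shifting every timestamp by the same amount leaves the encoding input unchanged, which is the property that makes relative offsets robust to clock drift between capture sessions.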
![Self-supervised pretraining effectively learns discriminative feature representations for fall detection, as evidenced by the clear separation of fall and non-fall classes in a t-SNE visualization, a significant improvement over random initialization which results in overlapping clusters.](https://arxiv.org/html/2602.11200v1/x9.png)
Empirical Validation: AM-FM’s Robustness Across Diverse Applications
The AM-FM model exhibits robust performance across multiple applications despite requiring limited task-specific labeled data, demonstrating significant data efficiency and transfer learning capabilities. This characteristic reduces the need for extensive, costly data annotation typically associated with training machine learning models for new tasks. Performance metrics indicate strong generalization; the model achieves high accuracy with relatively small datasets, suggesting its pre-trained feature representations are broadly applicable and readily adaptable to various domains including, but not limited to, activity recognition and localization. This inherent transferability minimizes the computational resources and time required for deployment in novel applications.
Bottleneck Adaptation and Temporal Probing serve as the primary fine-tuning methodologies for the AM-FM model when applied to downstream tasks. Bottleneck Adaptation involves adding a trainable bottleneck layer to the pre-trained model, allowing for task-specific feature extraction with limited labeled data. Temporal Probing then leverages this adapted model to analyze temporal sequences, crucial for applications like Human Activity Recognition (HAR), Fall Detection, and Gesture Recognition. For Localization tasks, these techniques enable the model to process and interpret sequential data to determine spatial positioning. This combined approach minimizes the need for extensive task-specific datasets while maximizing performance across diverse applications.
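A bottleneck layer of this kind is often built as a small residual adapter: project the frozen features down to a narrow hidden size, apply a nonlinearity, project back up, and add the result to the input. The sketch below follows that common pattern as an assumption; the class name, the sizes, and the zero initialization of the up-projection are illustrative and not taken from the paper.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class BottleneckAdapter:
    """Residual adapter: features + up(relu(down(features))).

    Only these two small matrices would be trained; the pretrained
    backbone that produces `features` stays frozen.
    """

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        self.down = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)]
            for _ in range(bottleneck)
        ]
        # Zero-initialized up-projection: the adapter starts as the identity.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def __call__(self, features):
        hidden = [max(0.0, z) for z in matvec(self.down, features)]
        delta = matvec(self.up, hidden)
        return [f + d for f, d in zip(features, delta)]

    def num_parameters(self):
        return len(self.down) * len(self.down[0]) + len(self.up) * len(self.up[0])
```

With a feature size of 8 and a bottleneck of 2, the adapter has 32 trainable weights, which shows why this style of fine-tuning needs far fewer labeled examples than updating the whole model.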
Evaluations demonstrate the AM-FM model achieves an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.995 on Localization tasks and 0.923 on Human Activity Recognition. Beyond these core applications, the model exhibits functionality in WiFi Imaging and Proximity Estimation. These capabilities position AM-FM for deployment in diverse environments, specifically smart home automation and healthcare monitoring systems where accurate spatial awareness and activity classification are critical.
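For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition; the function name is hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels are 0 or 1; scores are the model's confidence for class 1.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a score of 0.995 means nearly every positive example is ranked above nearly every negative one.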

Towards a Future of Ubiquitous and Adaptive Ambient Intelligence
AM-FM, the foundation model for ambient intelligence through WiFi described above, represents a significant stride towards realizing truly ubiquitous ambient intelligence. Because its representations are pretrained on unlabeled CSI and then adapted with lightweight fine-tuning, systems built on it can adjust to fluctuating environmental conditions and evolving user requirements, moving beyond static, pre-programmed responses. By prioritizing the most informative parts of a complex signal, as Adaptive Frequency Aggregation does across subcarriers, AM-FM is designed to operate effectively in unpredictable real-world scenarios. This adaptability is crucial for creating genuinely seamless integration of technology into daily life, powering applications that anticipate needs and respond proactively, ultimately fostering a more intuitive and responsive technological ecosystem.
A significant advancement in ambient intelligence lies in the development of models exhibiting robust cross-task transfer learning, drastically reducing reliance on vast quantities of labeled data. The reported evaluations show substantial performance gains in low-shot learning scenarios, specifically when only K ∈ {5, 10, 25, 50} labeled examples are available, across a diverse range of tasks. This heightened data efficiency represents a crucial step towards deploying adaptive systems in real-world environments where acquiring and annotating extensive datasets is often impractical or cost-prohibitive. By effectively leveraging knowledge gained during pretraining, the model minimizes the need for task-specific training, enabling quicker adaptation and broader applicability in dynamic and unpredictable settings.
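A low-shot evaluation of this kind starts by drawing a small labeled subset for each value of K. The sketch below shows one way to do that, assuming K counts examples per class; the article does not say whether K is per class or in total, and the function name and fixed seed are hypothetical.

```python
import random
from collections import defaultdict


def sample_k_shot(labels, k, seed=0):
    """Return sorted indices of up to k labeled examples per class.

    Assumption: K is interpreted per class, which is common in few-shot
    evaluation but is not confirmed by the source.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    chosen = []
    for label in sorted(indices_by_class):
        pool = indices_by_class[label]
        chosen.extend(rng.sample(pool, min(k, len(pool))))
    return sorted(chosen)
```

Running this for each K in (5, 10, 25, 50) and fine-tuning on only the selected indices gives one accuracy-versus-K curve per task; repeating with several seeds would show how sensitive the result is to which examples were drawn.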
The advent of adaptable ambient intelligence promises a transformative impact across multiple sectors, extending beyond simple automation to genuinely responsive environments. In healthcare, the technology facilitates proactive monitoring of patient well-being, potentially detecting subtle changes indicative of emerging conditions before they become acute. Simultaneously, building management systems can leverage this intelligence to optimize energy consumption, dynamically adjusting heating, cooling, and lighting based on occupancy and environmental conditions. Furthermore, enhanced security protocols become feasible, with systems capable of identifying anomalous activity and responding in real-time, moving beyond passive surveillance to actively protect assets and individuals. These converging applications highlight the potential for a future where environments intelligently anticipate and adapt to human needs, fostering greater safety, efficiency, and quality of life.

The development of AM-FM, as detailed in the paper, exemplifies a commitment to establishing provable foundations for ambient intelligence. The model’s reliance on self-supervised learning from extensive CSI data isn’t merely about achieving empirical success; it’s a pursuit of inherent, mathematical structure within the wireless environment. This echoes Donald Davies’ observation that, “The only way to achieve a really complex system is to build it out of simple parts.” AM-FM distills the complexity of real-world activity into quantifiable time-series data, revealing underlying patterns through rigorous analysis – a testament to building complex intelligence from fundamentally simple, mathematically defined components. The efficacy of AM-FM across diverse tasks isn’t simply demonstrated; it is derived from the correctness of its underlying representation learning.
What Remains Constant?
The presentation of AM-FM represents a pragmatic step – a leveraging of scale to mask underlying theoretical deficits. Let N approach infinity – what remains invariant? The signal, fundamentally, is not about activity, but about propagation – reflections, diffractions, and the chaotic interplay of electromagnetic waves. This work demonstrates the utility of representation learning, yet sidesteps the crucial question of what is being represented. Is the model truly capturing semantic meaning, or merely memorizing complex patterns in a high-dimensional space? The efficacy across diverse tasks is encouraging, but diversity of task does not equate to generality of understanding.
Future work must address the limitations inherent in relying solely on channel state information. The model’s performance is inextricably linked to the specific WiFi hardware and environment used for training. True ambient intelligence demands robustness to variation – a capacity to generalize beyond the particulars of the data upon which it was built. Furthermore, the energy cost of continuously sensing and processing CSI data remains a significant obstacle. A truly elegant solution will require a distillation of information – a parsimonious representation that captures the essence of the environment with minimal computational overhead.
The pursuit of “foundation models” in this domain risks conflating correlation with causation. The challenge is not simply to build a system that can recognize activities, but to build a system that understands the underlying physics – a system grounded in first principles, not merely empirical observation. Until that is achieved, the promise of truly intelligent ambient environments will remain, at best, a sophisticated illusion.
Original article: https://arxiv.org/pdf/2602.11200.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-02-14 13:10