Exploring the Future of Workplace Metrics With Machine Learning
October 18, 2022 | By Dr. Mike O'Neill
The ascendency of people-centric workplace design has changed how we evaluate workplace success. In a previous blog, we outlined how changes in workplace design, novel sources of data, and the move to predictive analytics will transform how we use data in workplace research. We suggest that the metrics of design success should include workforce and business outcomes. An abundance of new types of data is expanding the opportunity to understand the relationships between workplace, building, and even neighborhood characteristics and these outcomes.
In this blog, we discuss how emerging data platforms and data science tools (such as Machine Learning), when guided by theory and proper research design, can reveal new insights about how to make effective workplaces.
New data science tools and platforms could reveal novel insights
The complexity of creating statistical models that incorporate widely varied sources of data (as diverse as websites, online public databases, building control systems, and sensors) remains a challenge, especially when using traditional social science research methods and analyses. The analytic tools of data science, such as Machine Learning (ML), offer an opportunity to complement and extend the capabilities of existing social science methods. We define ML as a collection of technology components that collect, process, find patterns in, or interpret data using flexible algorithms and statistical techniques. The term “deep learning” is typically used to describe artificial neural network (ANN) models. These can be either prediction or classification models. For instance, Netflix has a “recommender” tool that uses a prediction model, analyzing your viewing history to suggest movies you may enjoy watching. Most spam filters use a classification model to label emails as “spam” or “not spam.”
It is not just that ML tools can process astonishingly large quantities of data, but that forms of data previously unusable by social science researchers are now accessible. This data can include images, sound, speech, video, free text, sensor data, social media posts, geographic information system (GIS) data with detailed population information, public health databases, and other robust sources. ML applications tend to require (and use) large amounts of potentially disparate types of data. This makes “cleaning” and transforming the data for use by a deep-learning model a critical (not to mention time-consuming) task. It also requires a solid grounding in statistics to ensure that the underlying distributions of the data, and any transformations applied to the raw data used by the model, are suitable. While this is true of any statistical analysis, the quality of input data is so integral to the functioning of ML models that AI without it has been described as “mathematical fiction.”
An emerging class of modular data hosting platforms is making it possible to seamlessly connect and analyze disparate data types and large volumes of data to create new insights. These platforms allow access to traditional social science statistics and also offer ML tools. Together, they could fundamentally change our ability to understand the relations between workplace design, behavioral and social phenomena, and business outcomes. For example, the ability to easily access large streams of air quality, noise level, and illumination data, coupled with space utilization and workplace survey data, could yield new insights about employee turnover.
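As a rough illustration of what connecting disparate data can mean in practice, the sketch below averages a stream of sensor readings by workplace zone and joins the result to survey records for the same zones. Everything in it is hypothetical: the zone names, the metrics (`co2_ppm`, `noise_db`), the survey fields, and the `build_zone_table` helper are invented for this example and do not describe any particular platform.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sensor stream: (zone, metric, reading).
sensor_readings = [
    ("floor-3-east", "co2_ppm", 640),
    ("floor-3-east", "co2_ppm", 660),
    ("floor-3-east", "noise_db", 48),
    ("floor-3-east", "noise_db", 52),
    ("floor-3-west", "co2_ppm", 900),
    ("floor-3-west", "noise_db", 61),
    ("floor-4-east", "co2_ppm", 700),  # no survey data for this zone
]

# Hypothetical survey and utilization results, one record per zone.
survey_by_zone = {
    "floor-3-east": {"satisfaction": 4.1, "utilization": 0.62},
    "floor-3-west": {"satisfaction": 3.2, "utilization": 0.81},
}


def build_zone_table(readings, survey):
    """Average each sensor metric per zone, then join with survey records."""
    grouped = defaultdict(lambda: defaultdict(list))
    for zone, metric, value in readings:
        grouped[zone][metric].append(value)

    table = {}
    for zone, metrics in grouped.items():
        if zone not in survey:
            continue  # inner join: keep only zones present in both sources
        row = {metric: mean(values) for metric, values in metrics.items()}
        row.update(survey[zone])
        table[zone] = row
    return table


zone_table = build_zone_table(sensor_readings, survey_by_zone)
for zone, row in sorted(zone_table.items()):
    print(zone, row)
```

The result is one row per zone that combines environmental and survey measures, which is the shape of table a statistical or ML model would typically consume. Real sensor streams would also need time alignment and cleaning, which this sketch leaves out.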
Social science uses a deductive approach; data science is often inductive
There is a fundamental difference in the purpose of data science and social science projects. Social science typically (but not exclusively) uses a “deductive” approach in which hypotheses about an existing theory are tested by collecting and analyzing data. Deductive reasoning is guided by theory; data is collected to test that theory. The intention is to identify factors that explain the outcome.
Data science projects often use an inductive approach with no guiding theory or hypothesis. Observations from the data are used to develop a theory that could explain those patterns. Data science models are most often found in business applications in which the accuracy of the model is important (for instance, accurately predicting consumer choice). Explaining the factors underlying the choice (i.e., testing a theory of consumer choice behavior) is not the intention.
Static versus adaptable models
Social science researchers build “hand-crafted” models that serve as proof statements of a theory. They use all available data to build the model. The parameters of these models are static. They are built using historical data, used once, almost never revalidated with additional data, and published.
In contrast, data scientists train, test, and validate their models using portions of their dataset, and the quality of these models can be improved at any time by “training” with new data added as part of an iterative discovery process. ML tools can optimize the internal parameters of these models to improve their quality.
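The train, validate, and test workflow, and the idea of improving a model later with new data, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended pipeline: the data is synthetic, the 60/20/20 split is an arbitrary choice, and `OnlineLinearModel` is a hypothetical one-feature model written for this example rather than a class from any library.

```python
import random


def make_rows(n, seed):
    """Synthetic stand-in data (hypothetical): y = 2x + 1 plus small noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        rows.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return rows


def split_rows(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle, then cut the rows into train / validation / test portions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


class OnlineLinearModel:
    """y = w*x + b, fitted by stochastic gradient descent on squared error."""

    def __init__(self, lr=0.05):
        self.w, self.b, self.lr = 0.0, 0.0, lr

    def predict(self, x):
        return self.w * x + self.b

    def partial_fit(self, rows, epochs=200):
        """Continue from the current parameters instead of starting over."""
        for _ in range(epochs):
            for x, y in rows:
                err = self.predict(x) - y
                self.w -= self.lr * err * x
                self.b -= self.lr * err

    def mse(self, rows):
        return sum((self.predict(x) - y) ** 2 for x, y in rows) / len(rows)


train_rows, val_rows, test_rows = split_rows(make_rows(100, seed=1))

model = OnlineLinearModel()
model.partial_fit(train_rows)          # initial training
val_mse = model.mse(val_rows)          # used to tune choices such as lr

new_rows = make_rows(50, seed=2)       # data that arrives later
model.partial_fit(new_rows)            # update the same model in place
test_mse = model.mse(test_rows)        # held-out data, touched only at the end

print(f"validation MSE: {val_mse:.4f}, test MSE after update: {test_mse:.4f}")
```

The point of the sketch is the contrast with a model fitted once on all available data: here part of the data is held back to check the model, and the same parameters are adjusted again when new observations arrive.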
Exploring the frontiers of workplace research with machine learning
As noted earlier, while ML models can be extremely good at predicting an outcome correctly, they are a “black box” because they are not designed to identify or validate the underlying causal factors related to an outcome. Thus, social scientists have been less likely to leverage these tools. However, there are newer ML methods, based on game theory, that can tell us how much each factor in a model has contributed to a prediction. These techniques can help to identify and visualize important relationships in the model, explaining its inner workings and rationale and adding a robust explanation to its prediction. They are not the same as proving causality between the factors and outcomes, or generalizing to a broader theory, but they improve the utility of these models nonetheless.
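The post does not name a specific method, but one well-known game-theoretic way to attribute a prediction to its inputs is the Shapley value, which averages each factor's marginal contribution over every possible combination of the other factors. The sketch below computes exact Shapley values for a toy model. The model `predict_satisfaction`, its coefficients, the feature names, and the all-zero baseline are hypothetical choices made for illustration only.

```python
from itertools import combinations
from math import factorial


def predict_satisfaction(features):
    """Hypothetical toy model with one interaction term."""
    return (50.0
            + 10.0 * features["air_quality"]
            + 5.0 * features["daylight"]
            - 8.0 * features["noise"]
            + 4.0 * features["air_quality"] * features["daylight"])


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; cost grows as 2^n, so small n only."""
    names = list(instance)
    n = len(names)

    def value(subset):
        # Features in `subset` take the instance's values;
        # all other features stay at their baseline values.
        mixed = {k: (instance[k] if k in subset else baseline[k])
                 for k in names}
        return predict(mixed)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                gain = value(coalition | {name}) - value(coalition)
                total += weight * gain
        result[name] = total
    return result


instance = {"air_quality": 1.0, "daylight": 1.0, "noise": 1.0}
baseline = {"air_quality": 0.0, "daylight": 0.0, "noise": 0.0}

contributions = shapley_values(predict_satisfaction, instance, baseline)
for name, amount in contributions.items():
    print(f"{name}: {amount:+.2f}")
```

For this toy model the contributions come out to +12 for air quality, +7 for daylight, and -8 for noise. The interaction term's 4 points are split evenly between air quality and daylight, and the three values sum to 11, the gap between the prediction for this instance (61) and the baseline prediction (50). As the paragraph above cautions, these numbers describe how the model arrived at its prediction, not what causes the outcome.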
The real potential lies with social science researchers who have a background in data science: they can apply the rigor of social science methods while drawing on the power of ML tools to handle the trove of new data, which could lead to breakthroughs in our understanding of how the workplace affects people. ML, coupled with hypotheses grounded in established behavioral theory and with field or experimental research design and data collection, will undoubtedly lead to new insights. In contrast to the traditional tasks for machine learning in computer science and statistics, when machine learning is applied to social scientific data, it could be used to identify important outcome variables and make predictions. A powerful differentiator for ML statistical models is that they can learn and improve over time by being trained with new data, and can even potentially change what they predict or classify.
The era of large sources of previously unavailable data creates an opportunity for workplace researchers to expand their toolkit. Machine learning methods are ideally suited to help researchers use the abundance of data to its best capacity. Machine learning tools are just tools, though, and not a magic wand absolving the researcher of following research best practices (e.g., proper study design, hypotheses, and testing assumptions of normality, linearity, and equality of variance in the data).
In the social sciences, ML will perform best when it is applied appropriately to research problems. Researchers need to think creatively about the statistical task and consider whether a machine learning application could be advantageous. Perhaps the most important consideration is that the staggering amount of data now available, and the relative ease of using analytic tools, have made theory (and the specific models that may underlie a theory) critical to conducting quality research. Theory and models provide an important roadmap for how to design a study, formulate research hypotheses and questions, and select the right analytic tools, as well as a context for interpreting results (Ashworth et al. 2015, Slough 2019, de Marchi & Stewart 2020).
Next steps in our journey with machine learning tools
Workplace research is entering an era of “an embarrassment of riches,” and machine learning tools represent an opportunity to extract meaning from previously unusable types of data. The addition of ML to workplace research is no panacea; more than ever, it requires us to carefully apply the right tools to specific analytic needs and to develop best practices. When machine learning is applied to workplace research, it can be used to discover new concepts, reveal patterns and associations in data, and make predictions. ML will complement the deductive, “single point in time” approach of social science with a more interactive and iterative approach to research. Finally, the inductive and sometimes qualitative outcomes of ML (such as revealing associated data patterns) may offer useful perspectives that lend themselves to explaining the qualitative nature of workplace design and user experience.
A cross-functional team led by the Gensler Research Institute (GRI) is using the tools of data science and new sources of city, neighborhood, building, interior design, demographic, and survey data to investigate the long-term potential of this approach. Other GRI-funded internal projects are also using ML to leverage our existing datasets to reveal insights about user experience.