Author: Jack Danahy

It’s time to find common ground and a common lexicon to simplify security operations and decision-making.

The cybersecurity market is probably one of the most innovative and fractured in all of technology. Unlike in storage, processing, or SaaS, the evolution of new approaches in cybersecurity is triggered by new threats, and seldom by innovation or a shared understanding of emerging customer needs. As a result, each year brings with it multiple new entrants, new approaches, and increasing complexity that stymies even the most practiced cybersecurity professional.

We see the sprawl and confusion created by this expansion daily. You may not know, but probably suspect, that there are currently over 3,500 cybersecurity vendors in the U.S. Each speaks its own security language, and experts must understand whatever dialect is used by the tools they have in hand. This makes monitoring, analytics, response, and team education exponentially more difficult with every new telemetry source.

As cybersecurity professionals, we work to improve the efficacy and understanding of cybersecurity, both for our investors and cohort companies in Almanna Cyber and for our managed security clients at NuHarbor Security. That’s why we’re sponsoring and leading our first joint hackathon, the Cybersecurity Polyglot Challenge. The successful Polyglot team will develop a means of translating data from this highly heterogeneous mix into a single representation so that security management tools can either request or automatically ingest consistently formatted information derived from multiple sources.

What will the Polyglot Project accomplish?

Whether an organization wants to understand if it is under attack or if a ransomware campaign is spreading, it needs to ingest information provided by multiple vendors and technologies. While the format and organization of telemetry from different providers will vary, the core elements are the same. An endpoint security product will send alerts and messages with an address, a machine type, a timestamp, an alert priority, a related user, and more, but every vendor will organize and present this information differently. This mimics a spoken or written language, where the structure of object/action/description is common. The Polyglot Project will create a translator, pulling out the common and most important elements of security device telemetry data to support a data abstraction layer that will generate platform-agnostic representations for analytics, reporting, and alerting.
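To make the idea of shared core elements concrete, here is a minimal Python sketch of a hand-written translator for two made-up vendors. Everything in it is an assumption for illustration: the `NormalizedEvent` fields, the `vendor_a` and `vendor_b` payload layouts, and the 1-to-4 priority scale are hypothetical and do not reflect any real product's format. A Polyglot entry would presumably learn such mappings rather than hard-code them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class NormalizedEvent:
    """Vendor-neutral view of an endpoint alert (illustrative fields only)."""
    source: str
    address: str
    machine_type: str
    timestamp: datetime
    priority: int  # 1 (low) to 4 (critical); this scale is an assumption
    user: str


def from_vendor_a(raw: dict) -> NormalizedEvent:
    # Hypothetical vendor A: flat keys, epoch seconds, numeric severity.
    return NormalizedEvent(
        source="vendor_a",
        address=raw["ip"],
        machine_type=raw["device_type"],
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        priority=int(raw["severity"]),
        user=raw["username"],
    )


# Hypothetical mapping from vendor B's textual levels to the shared scale.
_VENDOR_B_PRIORITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def from_vendor_b(raw: dict) -> NormalizedEvent:
    # Hypothetical vendor B: nested keys, ISO 8601 time, textual severity.
    return NormalizedEvent(
        source="vendor_b",
        address=raw["host"]["address"],
        machine_type=raw["host"]["kind"],
        timestamp=datetime.fromisoformat(raw["time"]),
        priority=_VENDOR_B_PRIORITY[raw["level"].lower()],
        user=raw["actor"],
    )


# The same alert, as each made-up vendor might report it.
a = from_vendor_a({
    "ip": "10.0.0.5", "device_type": "laptop", "ts": 1673784000,
    "severity": 3, "username": "jdoe",
})
b = from_vendor_b({
    "host": {"address": "10.0.0.5", "kind": "laptop"},
    "time": "2023-01-15T12:00:00+00:00", "level": "High", "actor": "jdoe",
})
```

After translation, `a` and `b` differ only in their `source` field, so downstream analytics can treat them as the same kind of record.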

In our challenge, we’re looking for a new approach to reducing complexity through the use of a common interpreter that can normalize event language across multiple vendor platforms. Specifically, we challenge Polyglot teams to create a single higher-level abstraction of event language that will allow an analyst to recognize malicious behavior across CrowdStrike, Microsoft Defender, and other endpoint solutions using a single set of queries.

What are likely approaches?

The subject technologies will detect malicious events on protected endpoints but will report on them in different formats. We’re looking for a translating function, an interpreter, capable of consuming logs and events from these differing technologies to produce an output that’s normalized into a consistent format and can be queried by an analyst who won’t need to understand or even recognize the technology providing the information.
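As a rough illustration of what querying normalized output could look like, the Python sketch below filters a small list of already-normalized events. The event fields, the sample data, and the `find_events` helper are all hypothetical; the point is only that the query never inspects which technology produced an event.

```python
from typing import Optional

# Already-normalized events from two hypothetical sources (made-up data).
EVENTS = [
    {"source": "vendor_a", "address": "10.0.0.5", "priority": 3, "user": "jdoe"},
    {"source": "vendor_b", "address": "10.0.0.9", "priority": 4, "user": "asmith"},
    {"source": "vendor_b", "address": "10.0.0.7", "priority": 1, "user": "jdoe"},
]


def find_events(events: list, min_priority: int = 1,
                user: Optional[str] = None) -> list:
    """Return events at or above min_priority, optionally for one user.

    The filter only touches normalized fields, so it works the same
    regardless of which source produced each event.
    """
    return [
        e for e in events
        if e["priority"] >= min_priority and (user is None or e["user"] == user)
    ]


high_priority = find_events(EVENTS, min_priority=3)
jdoe_events = find_events(EVENTS, user="jdoe")
```

Here `high_priority` contains events from both sources, selected by one query, which is the analyst experience the challenge describes.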

We envision the solution to this problem as one leveraging a new form of natural language processing (NLP). This may include frameworks, libraries, or packages designed to learn and understand the meaning of an ambiguous language based on context. Example solutions might utilize an encoder-decoder strategy, a graph attention network (GAT), or other graph-based deep learning models. Ultimately, we’re agnostic in terms of specific technical solutions. Scoring criteria are focused on accuracy of the models, their performance, quality of the code, and each team’s future recommendations.

The approach will need to perform at scale, as these are high-volume messages that must be processed with machine-speed performance. While we expect this to be much like an NLP solution, there’s a very limited syntax (i.e., the event/message format) and set of output items (i.e., the normalized events).

Why are Almanna and NuHarbor sponsoring the Polyglot Project?

Almanna Cyber is a cybersecurity-focused venture fund and startup accelerator. The developing Almanna cohort of companies will benefit from a common ontology, and the creation of a company to build a product-strength version of Polyglot is an attractive investment opportunity.

NuHarbor Security manages cybersecurity for hundreds of organizations, many of them in the public sector, who rely on a heterogeneous and growing set of underlying security technologies. As a result, NuHarbor analysts and engineers are very familiar with the vagaries of these tools and the complexities they introduce for users. Polyglot will supply a unifying framework for monitoring and response tooling in the hands of these experts.

This novel approach is important enough that the winning entry will be awarded a $10,000 (USD) prize, and five notable, non-winning entries will be eligible for $1,000 (USD) prizes.

If you’re interested in participating, or know others who may be, full project and registration details are available at The Cybersecurity Polyglot Project. Registration is open until February 15, 2023, and final submissions are due on March 31, 2023.

Come help us make cybersecurity easier with your ideas, innovation, and experience.