{"id":55,"date":"2020-11-01T12:26:02","date_gmt":"2020-11-01T12:26:02","guid":{"rendered":"http:\/\/infosecml.com\/?p=55"},"modified":"2022-04-07T14:57:13","modified_gmt":"2022-04-07T14:57:13","slug":"so-you-want-to-build-your-own-deep-learning-solution","status":"publish","type":"post","link":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/","title":{"rendered":"Build your own Deep Learning UEBA system?"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">At the time of writing this article I lead the Solutions Architecture team at Exabeam, a UEBA based SIEM company. As such,  I tended to get into some pretty interesting conversations with customers about both security monitoring and data science. My favourite conversation by subject is definitely when customers tell me they want to build their own analytics platform. Perhaps they have read a paper on the subject or they have just hired a data science team into security or something else makes them want to explore the technology at a deeper level than just using what we provide out of the box.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If the \u201cI want to do data science for myself\u201d is my favourite conversation then the \u201cI want to do Deep Learning\u201d is a special treat. It is my own area of research and whilst we don\u2019t do deep learning in our platform there is no reason why we shouldn\u2019t make deep learning for security analytics a reality for those customers who are ready to take that step. If you are that customer \u2013 or perhaps you aren\u2019t even a customer yet, buckle in and I\u2019ll talk you through the dos and don\u2019ts of deep learning for security monitoring.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Often the first conversation on this subject I have is when a customer, new to the subject, starts challenging me on some aspect of the Exabeam platform. 
Normally I\u2019m being told that I am missing some trick, or just plain doing it wrong. Often the customer intends to prove this through some grand plan of their own. These grand plans range from \u201cDeep learning will just automagically find the bad guys\u201d through \u201cI have just learnt about Jupyter notebooks and want to find the bad guys with that\u201d to \u201cWe built a data lake five years ago and we now plan to start analysing that data\u201d; and my response to all of these is \u201cLet me stop you right there\u201d\u2026<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">All of these approaches make the same set of mistakes, but often at different levels. These mistakes often fall into the \u201cdon\u2019t start with raw data\u201d category. What this means is that data science, no matter what area you work in, has a data pipeline before you get to the point where you are doing actual analytics \u2013 long before you can do analytics in most cases. And the truth of the matter is that raw data tends to be horrible. I mean really bad \u2013 useless in most cases. If you look at a typical SIEM vendor, the first two stages in their pipeline are data collection and data parsing. Well, they are also the first two stages a data science team need to consider. They need to collect the right data in a consistent manner, ideally without too many gaps, and they need to isolate the data features \u2013 to use the data science parlance, or to extract the fields as we InfoSec people like to think of it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">For this reason I tend to suggest that budding security analytics teams start from their existing SIEM and pull data already parsed and normalised. What do I mean by normalised? Well, if you want to analyse logins then you probably want to do that regardless of the technology being logged into. So a login on a Cisco router needs to be analysable in the same way as a login on a Linux server or a Windows workstation. 
Normalisation is changing the data so that features or fields are made the same based upon equivalence of action. Your SIEM will do all of that for you (or it should). So save yourself a lot of hassle and take post-ingestion data. If you built your own data lake some time ago my bet is that the normalisation layer (and possibly even the parsing layer) got missed out. If not then you are one of the lucky few. Data lakes are often built on the schemaless (or schema on read) principle. That makes it easy to get data into them but defers the work of normalisation and feature extraction to later in the pipeline. One way or another that work needs to get done.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So now you have all of your data you need to have a theory about what it can show you. Many new to the field of Machine Learning believe that they can just throw all of the data at a Machine Learning program and it will spit out revelations and insights. Unfortunately that is not the case. You need to develop a hypothesis and then you need to develop a model for analysing that hypothesis. Lastly you need to evaluate the success of the model so you know if it is worth deploying it in production, i.e. giving insights to the overstretched SOC team to use on a daily basis. Trust me, giving them a whole lot more false positives to look at every day isn\u2019t going to make you popular. For more information on the approach I suggest you read: <a href=\"https:\/\/en.wikipedia.org\/wiki\/Scientific_method\">The Scientific Method<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There are any number of things you might choose to analyse but here is one I tried, and yes, it uses deep learning. The Exabeam Advanced Analytics system identifies anomalies and attaches risk to each one based upon a wide range of criteria including context associated with the person or machine. Risk accumulates for a user (or machine) but it also decays. 
Exabeam can provide a list of notable users whose risk has risen above a threshold. This works very well in practice and we have seen countless breaches halted in their tracks with this system. My hypothesis was that, looking over a longer time period, users would generally all generate some risk, as anomalies are simply that: some atypical behaviour. Corporate users are always accessing new systems, being put into new groups or forgetting their passwords. However, I postulated that these normal anomalies might form a pattern for each group of users based upon their normal role and these patterns would be different to the pattern of anomalies generated by a bad actor; someone seeking to compromise systems would generate many of the same anomalies as a normal user but they might also not generate some of those more typical anomalies. In short I wanted to look at anomalous anomalies.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is an excellent example of using data from as late in the processing pipeline as possible. As my input I planned to use the list of users who generate risk, together with the anomalies which contribute to that risk, but I deliberately ignored the level of risk for each anomaly as irrelevant. I needed to generate a list of candidate notable users as my output to replace the list generated simply by adding risk scores. I had no tagged data to work with as I had to assume that for any given organisation there might be good and bad users already present. I therefore knew I needed an unsupervised learning approach \u2013 one which would find patterns in my data without the need to have data already tagged as good behaviour or bad behaviour on which first to train. This is one of the biggest problems in Machine Learning for Information Security. There just isn\u2019t much bad behaviour tagged data around and where there is, it tends to be specific to a single organisation or technology. 
Supervised learning techniques tend to be a little niche for Information Security use cases as a result.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">After some investigation I decided to try using a Deep Learning technique called Self Organising Maps or SOMs. I\u2019m not going to explain SOMs to you because there are many excellent articles which do that very well. In short, though, the data I had was high dimensional. That is to say that each anomaly in my data would form a dimension (also called a feature) and each user would record their anomalies as a row in this data. So my first run against real data over a month gave me 9,180 users (rows) with 115 different anomalies triggered. I had over nine thousand rows of data and 115 dimensions (columns of features). The great thing about a SOM is that it will effectively reduce those dimensions to three, something much easier to work with.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"411\" src=\"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37-1024x411.png\" alt=\"\" class=\"wp-image-56\" srcset=\"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37-1024x411.png 1024w, https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37-300x120.png 300w, https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37-768x308.png 768w, https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37-1536x616.png 1536w, https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/Screenshot-2020-11-01-at-12.21.37.png 1884w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">I used Python to build my program. If you don\u2019t know Python then Machine Learning probably isn\u2019t for you. 
Python has become the default standard for ML over the last few years and there are lots of packages available to make your life easier. This is where I had an advantage, as I pulled the data directly from Mongo \u2013 I have access to parts of the system that normal users don\u2019t, but I could have used the Exabeam APIs to do this with just a bit more work. I then built a <em>pandas<\/em> data frame containing this data and used an open source implementation of Self Organising Maps called MiniSom (<a href=\"https:\/\/pypi.org\/project\/MiniSom\/\">https:\/\/pypi.org\/project\/MiniSom\/<\/a>) to analyse it. MiniSom basically makes the data gravitate together or clump based upon its similarity. The idea is to form clusters of similar behaviour and then to look at the <strong>outliers<\/strong> within a two dimensional grid. I needed good clumping so I needed a lot of iterations of the training for the SOM model. After playing around with the size of the grid used to arrange the data and the number of iterations, as well as some other hyperparameters (parameters which control the analysis \u2013 not the ones which come from your data), I was able to get some very interesting results. Having confirmed that everything was working as expected I also excluded some low risk but high frequency risk rules as they made it harder to find outliers.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I decided to add a visualisation of the SOMs I was generating to get a better understanding of how they were training and I have shown a couple here. The dark areas are high density groups where users are forming common patterns and the very light areas are less dense and contain smaller clusters. The secret is to find the least dense part of the diagram for the most anomalous users. Don\u2019t forget there are over 9,000 users in this grid. 
I also decided that I would run multiple epochs (an epoch is running through your training once, so multiple epochs means running it multiple times with different starting positions). I then looked for users who were outliers in multiple epochs \u2013 these truly were anomalous. The reason for doing this is that the map starts with random values and it is possible for a clump to form close to a real anomaly, obscuring it; running multiple epochs reduces this chance and gives significantly better results. Some epochs would simply identify very small groups of normal users, so at least ten epochs tended to give me the best chance of finding the real bad actors without also generating these annoying false positives.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Having found the anomalies I went back to the original pandas data frame and identified which user each one was. This step is needed because my SOM operates on purely numeric data, so anomalies need to be encoded in a categorical fashion with a 1 for triggered and a 0 for not triggered for each anomaly type. I could have considered using a sliding scale for how frequently an anomaly was triggered but actually that would have been irrelevant to my hypothesis (stick to the plan \u2013 don\u2019t be tempted to throw in irrelevant data). My final trick was to look at the anomalies each user had triggered in the period and map them against MITRE to get a feel for their activity. Clearly if they were all in the same MITRE Tactics area they were less interesting than activity spread across what could be an entire kill-chain.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here are some of my results. Is this better than what Exabeam Advanced Analytics does by default? Who knows; it\u2019s a different view into the user activity and avoids assigning fairly arbitrary risk scores. It also looks for patterns over long periods of time and so can find low and slow attacks. 
You could find all of this activity by using the threat hunter feature, if you knew what to look for. I set out to see if there were different approaches to finding notable users to investigate which might provide different results as a starting point for an investigation. What this does demonstrate are the thought processes which might lead you to develop your own Deep Learning approach to Information Security. Start with preprocessed data. Have a theory to test. Play around with the parameters and the model, and finally know whether your model is producing something real or just random noise. The jury is still out on that last one. Here are some of the outputs for you to judge:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong>Notable Users<\/strong>\n<strong><em>svc-proxy\u00a0 - Risk: 10.0<\/em><\/strong>\nRules\nDC23 : Abnormal session start time\n\u00a0 \u00a0 - T1124 (System Time Discovery)\nWPA-OH-F : First execution of critical windows command using privileged access on this host\n\u00a0 \u00a0 - T1082 (System Information Discovery)\nSEQ-UH-16 : Exceeded number of failed logons for the user\n\u00a0 \u00a0 - T1110 (Brute Force)\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nA-EPA-HP-F : First execution of process on asset\n\u00a0 \u00a0 - T1204 (User Execution)\nEPA-HP-F : First execution of process on host\n\u00a0 \u00a0 - T1204 (User Execution)\nAL-RT : Risk transfer from account lockout activities\nWPA-UH-F : First privileged access event on host for user\n\u00a0 \u00a0 - T1068 (Exploitation for Privilege Escalation)\nRA-UH-F : First access to asset\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nEPA-OP-F : First execution of process in this organization\n\u00a0 \u00a0 - T1204 (User Execution)\nA-EPA-OP-F : First execution of process for the asset in this organization\n\u00a0 \u00a0 - T1204 (User Execution)\nEPA-UP-F : First execution of process for user\n\u00a0 \u00a0 - T1204 (User Execution)\nEPA-USequenceSize-WC : Abnormal number of critical windows command 
executions by the user\n\u00a0 \u00a0 - T1059 (Command-Line Interface)\nEPA-OH-F : First execution of critical windows command on this host\n\u00a0 \u00a0 - T1059 (Command-Line Interface)\nDC24 : Abnormal day of week\n\u00a0 \u00a0 - T1124 (System Time Discovery)\nTactics Employed\nTA0001 :\u00a0 Initial Access\nTA0002 :\u00a0 Execution\nTA0003 :\u00a0 Persistence\nTA0004 :\u00a0 Privilege Escalation\nTA0005 :\u00a0 Defense Evasion\nTA0006 :\u00a0 Credential Access\nTA0007 :\u00a0 Discovery\n\n\n<strong><em>bracer-admin\u00a0 - Risk: 20.0<\/em><\/strong>\nRules\nAM-OG-F : First member addition to this group for the organization\n\u00a0 \u00a0 - T1098 (Account Manipulation)\nRA-UH-F : First access to asset\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nEPA-UP-F : First execution of process for user\n\u00a0 \u00a0 - T1204 (User Execution)\nA-EPA-HP-F : First execution of process on asset\n\u00a0 \u00a0 - T1204 (User Execution)\nEPA-OP-F : First execution of process in this organization\n\u00a0 \u00a0 - T1204 (User Execution)\nA-EPA-OP-F : First execution of process for the asset in this organization\n\u00a0 \u00a0 - T1204 (User Execution)\nEPA-PU-PS-F : First execution of powershell process for user\n\u00a0 \u00a0 - T1086 (PowerShell)\nAE-UA-F : First activity type for user\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nAM-UA-MA-F : First account group management activity for user\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nWPA-UH-F : First privileged access event on host for user\n\u00a0 \u00a0 - T1068 (Exploitation for Privilege Escalation)\nTactics Employed\nTA0001 :\u00a0 Initial Access\nTA0002 :\u00a0 Execution\nTA0003 :\u00a0 Persistence\nTA0004 :\u00a0 Privilege Escalation\nTA0005 :\u00a0 Defense Evasion\nTA0006 :\u00a0 Credential Access\n\n\n<strong><em>griffinb , Beatrice Griffin - Risk: 0.0\n<\/em><\/strong>Rules\nA-NET-HCountry-Outbound-F : First outbound connection to this country from asset\n\u00a0 \u00a0 - T1071 (Standard Application Layer Protocol)\nPA-NoIT : Badge 
access without IT presence\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nEPA-HP-F : First execution of process on host\n\u00a0 \u00a0 - T1204 (User Execution)\nA-EPA-HP-F : First execution of process on asset\n\u00a0 \u00a0 - T1204 (User Execution)\nA-EPA-OP-F : First execution of process for the asset in this organization\n\u00a0 \u00a0 - T1204 (User Execution)\nAE-UA-F : First activity type for user\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nAE-UA-F-VPN : First VPN connection for user\n\u00a0 \u00a0 - T1133 (External Remote Services)\nPA-UTi-A : Badge access at abnormal time\n\u00a0 \u00a0 - T1078 (Valid Accounts)\nA-NET-HdPort-Outbound-F : First outbound connection on port for asset\n\u00a0 \u00a0 - T1065 (Uncommonly Used Port)\nTactics Employed\nTA0001 :\u00a0 Initial Access\nTA0002 :\u00a0 Execution\nTA0003 :\u00a0 Persistence\nTA0004 :\u00a0 Privilege Escalation\nTA0005 :\u00a0 Defense Evasion\nTA0011 :\u00a0 Command and Control<\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">I don\u2019t know about you but I think I\u2019d be tempted to take a look at the timelines for those users, though the last one looks like it could just be a user travelling; worth a check though I\u2019d say. That is the last take-away for security analytics. It doesn\u2019t provide definitive answers most of the time. Its real value is that it points you in the right direction: it identifies things worthy of your time to investigate and it gives you something better to work on than chasing the same set of alerts all the time. Machine learning is just another tool in your kitbag. The best tool is still a skilled security professional looking at a timeline.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>At the time of writing this article I lead the Solutions Architecture team at Exabeam, a UEBA based SIEM company. As such, I tended to get into some pretty interesting conversations with customers about both security monitoring and data science. 
My favourite conversation by subject is definitely when customers tell me they want to build [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":372,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,7],"tags":[5,25,10],"class_list":["post-55","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-deep-learning","category-machine-learning","tag-deep-learning","tag-self-organising-maps","tag-ueba"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Build your own Deep Learning UEBA system? - InfoSecML<\/title>\n<meta name=\"description\" content=\"Building a real Deep Learning security solution may not be as hard as you think. This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Build your own Deep Learning UEBA system? - InfoSecML\" \/>\n<meta property=\"og:description\" content=\"Building a real Deep Learning security solution may not be as hard as you think. 
This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/\" \/>\n<meta property=\"og:site_name\" content=\"InfoSecML\" \/>\n<meta property=\"article:published_time\" content=\"2020-11-01T12:26:02+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-04-07T14:57:13+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2870\" \/>\n\t<meta property=\"og:image:height\" content=\"1137\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Steve Gailey\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Steve Gailey\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"13 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/\"},\"author\":{\"name\":\"Steve Gailey\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/#\\\/schema\\\/person\\\/f11f4ea3133147a580202f13b6da27e8\"},\"headline\":\"Build your own Deep Learning UEBA 
system?\",\"datePublished\":\"2020-11-01T12:26:02+00:00\",\"dateModified\":\"2022-04-07T14:57:13+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/\"},\"wordCount\":2284,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/#\\\/schema\\\/person\\\/f11f4ea3133147a580202f13b6da27e8\"},\"image\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/infosecml.com\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/som.png\",\"keywords\":[\"Deep learning\",\"Self Organising Maps\",\"UEBA\"],\"articleSection\":[\"Deep Learning\",\"Machine Learning\"],\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/\",\"url\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/\",\"name\":\"Build your own Deep Learning UEBA system? 
- InfoSecML\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/infosecml.com\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/som.png\",\"datePublished\":\"2020-11-01T12:26:02+00:00\",\"dateModified\":\"2022-04-07T14:57:13+00:00\",\"description\":\"Building a real Deep Learning security solution may not be as hard as you think. This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#primaryimage\",\"url\":\"https:\\\/\\\/infosecml.com\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/som.png\",\"contentUrl\":\"https:\\\/\\\/infosecml.com\\\/wp-content\\\/uploads\\\/2020\\\/11\\\/som.png\",\"width\":2870,\"height\":1137,\"caption\":\"Self Organising Map\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/index.php\\\/2020\\\/11\\\/01\\\/so-you-want-to-build-your-own-deep-learning-solution\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/infosecml.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Build 
your own Deep Learning UEBA system?\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/infosecml.com\\\/#website\",\"url\":\"https:\\\/\\\/infosecml.com\\\/\",\"name\":\"InfoSecML\",\"description\":\"The home of Machine Learning and Advanced Analytics for Information Security\",\"publisher\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/#\\\/schema\\\/person\\\/f11f4ea3133147a580202f13b6da27e8\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/infosecml.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/infosecml.com\\\/#\\\/schema\\\/person\\\/f11f4ea3133147a580202f13b6da27e8\",\"name\":\"Steve Gailey\",\"logo\":{\"@id\":\"https:\\\/\\\/infosecml.com\\\/#\\\/schema\\\/person\\\/image\\\/\"},\"sameAs\":[\"http:\\\/\\\/infosecml.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Build your own Deep Learning UEBA system? - InfoSecML","description":"Building a real Deep Learning security solution may not be as hard as you think. This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/","og_locale":"en_GB","og_type":"article","og_title":"Build your own Deep Learning UEBA system? - InfoSecML","og_description":"Building a real Deep Learning security solution may not be as hard as you think. 
This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.","og_url":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/","og_site_name":"InfoSecML","article_published_time":"2020-11-01T12:26:02+00:00","article_modified_time":"2022-04-07T14:57:13+00:00","og_image":[{"width":2870,"height":1137,"url":"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png","type":"image\/png"}],"author":"Steve Gailey","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Steve Gailey","Estimated reading time":"13 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#article","isPartOf":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/"},"author":{"name":"Steve Gailey","@id":"https:\/\/infosecml.com\/#\/schema\/person\/f11f4ea3133147a580202f13b6da27e8"},"headline":"Build your own Deep Learning UEBA system?","datePublished":"2020-11-01T12:26:02+00:00","dateModified":"2022-04-07T14:57:13+00:00","mainEntityOfPage":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/"},"wordCount":2284,"commentCount":0,"publisher":{"@id":"https:\/\/infosecml.com\/#\/schema\/person\/f11f4ea3133147a580202f13b6da27e8"},"image":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#primaryimage"},"thumbnailUrl":"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png","keywords":["Deep learning","Self Organising Maps","UEBA"],"articleSection":["Deep Learning","Machine 
Learning"],"inLanguage":"en-GB","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/","url":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/","name":"Build your own Deep Learning UEBA system? - InfoSecML","isPartOf":{"@id":"https:\/\/infosecml.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#primaryimage"},"image":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#primaryimage"},"thumbnailUrl":"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png","datePublished":"2020-11-01T12:26:02+00:00","dateModified":"2022-04-07T14:57:13+00:00","description":"Building a real Deep Learning security solution may not be as hard as you think. 
This article describes a project to identify low and slow attacks based upon anomalous behaviour of the attacker.","breadcrumb":{"@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#primaryimage","url":"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png","contentUrl":"https:\/\/infosecml.com\/wp-content\/uploads\/2020\/11\/som.png","width":2870,"height":1137,"caption":"Self Organising Map"},{"@type":"BreadcrumbList","@id":"https:\/\/infosecml.com\/index.php\/2020\/11\/01\/so-you-want-to-build-your-own-deep-learning-solution\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/infosecml.com\/"},{"@type":"ListItem","position":2,"name":"Build your own Deep Learning UEBA system?"}]},{"@type":"WebSite","@id":"https:\/\/infosecml.com\/#website","url":"https:\/\/infosecml.com\/","name":"InfoSecML","description":"The home of Machine Learning and Advanced Analytics for Information Security","publisher":{"@id":"https:\/\/infosecml.com\/#\/schema\/person\/f11f4ea3133147a580202f13b6da27e8"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/infosecml.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":["Person","Organization"],"@id":"https:\/\/infosecml.com\/#\/schema\/person\/f11f4ea3133147a580202f13b6da27e8","name":"Steve 
Gailey","logo":{"@id":"https:\/\/infosecml.com\/#\/schema\/person\/image\/"},"sameAs":["http:\/\/infosecml.com"]}]}},"_links":{"self":[{"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/posts\/55","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/comments?post=55"}],"version-history":[{"count":17,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/posts\/55\/revisions"}],"predecessor-version":[{"id":571,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/posts\/55\/revisions\/571"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/media\/372"}],"wp:attachment":[{"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/media?parent=55"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/categories?post=55"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/infosecml.com\/index.php\/wp-json\/wp\/v2\/tags?post=55"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}