Launch HN: Humanloop (YC S20) – A platform to annotate, train and deploy NLP https://ift.tt/3hPLoiS
Hey HN. We’re Peter, Raza and Jordan of Humanloop (https://humanloop.com) and we’re building a low-code platform to annotate data, rapidly train and then deploy Natural Language Processing (NLP) models. We use active learning research to make this possible with 5-10x less labelled data.

We’ve worked on large machine learning products in industry (Alexa, text-to-speech systems at Google, and insurance modelling) and seen first-hand the huge effort required to get these systems trained, deployed and working well in production. Despite huge progress in pretrained models (BERT, GPT-3), one of the biggest bottlenecks remains getting enough _good quality_ labelled data.

Unlike annotations for driverless cars, the data being annotated for NLP often requires domain expertise that’s hard to outsource. We’ve spoken to teams using NLP for medical chatbots, legal contract analysis, cyber security monitoring and customer service, and it’s not uncommon to find teams of lawyers or doctors doing text labelling tasks. This is an expensive barrier to building and deploying NLP.

We aim to solve this problem by providing a text annotation platform that trains a model as your team annotates. Coupling data annotation and model training has a number of benefits:

1) We can use the model to select the most valuable data to annotate next. This “active learning” loop can often reduce data requirements by 10x.
2) A tight iteration cycle between annotation and training lets you pick up on errors much sooner and correct the annotation guidelines.
3) As soon as you’ve finished the annotation cycle, you have a trained model ready to be deployed.

Active learning is far from a new idea, but getting it to work well in practice is surprisingly challenging, especially for deep learning. Simple approaches use the ML model’s predictive uncertainty (the entropy of the softmax) to select what data to label, but in practice this often selects genuinely ambiguous or “noisy” data that both annotators and models have a hard time handling. From a usability perspective, the process needs to take the annotation effort into account, and the models need to update quickly with new labelled data; otherwise a human-in-the-loop training session is too frustrating.

Our approach uses Bayesian deep learning to tackle these issues. Raza and Peter worked on this in their PhDs at University College London alongside fellow cofounders David and Emine [1, 2]. With Bayesian deep learning, we incorporate uncertainty in the parameters of the models themselves, rather than just finding the single best model. This can be used to find the data where the model is uncertain, not just where the data is noisy. And we use a rapid approximate Bayesian update to give quick feedback from small amounts of data [3]. An upside of this is that the models have well-calibrated uncertainty estimates (they know when they don’t know), and we’re exploring how this could be used in production settings for a human-in-the-loop fallback.

Since starting, we’ve been working with data science teams at two large law firms to help build out an internal platform for cyber threat monitoring and data extraction. We’re now opening up the platform to train text classifiers and span-tagging models quickly and deploy them to the cloud. A common use case is classifying support tickets or chatbot intents.

We came together to work on this because we kept seeing data as the bottleneck for the deployment of ML and were inspired by ideas like Andrej Karpathy’s Software 2.0 [4]. We anticipate a future in which the barriers to ML deployment are lowered enough that domain experts can automate tasks for themselves through machine teaching, and we view data annotation tools as a first step along this path.

Thanks for reading. We love HN and we’re looking forward to any feedback, ideas or questions you may have.

[1] https://ift.tt/3hK2xus - a scalable approach to estimating uncertainty in deep learning models
[2] https://ift.tt/39FoLLa - work to combine uncertainty with representativeness when selecting examples for active learning
[3] https://ift.tt/39B0GFo - a simple Bayesian approach to learning from little data
[4] https://ift.tt/2hsOCzx

July 29, 2020 at 04:57PM
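As a rough illustration of the point in the post about softmax entropy picking out noisy data, here is a minimal sketch that compares two selection scores. It assumes you can draw several predictions per example from different sampled models (for instance an ensemble or dropout samples). The second score, mutual information between the prediction and the model parameters (often called BALD), is a commonly used way to isolate parameter uncertainty; it is an assumption here and not necessarily the criterion Humanloop uses. The pool, example ids and function names are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def predictive_entropy(sampled_probs):
    """Entropy of the mean prediction across sampled models (total uncertainty)."""
    n = len(sampled_probs)
    mean = [sum(column) / n for column in zip(*sampled_probs)]
    return entropy(mean)

def mutual_information(sampled_probs):
    """Total entropy minus the average per-sample entropy.

    This is large only when the sampled models disagree with each other,
    i.e. when the uncertainty comes from the parameters rather than from
    data that every model finds ambiguous.
    """
    expected = sum(entropy(p) for p in sampled_probs) / len(sampled_probs)
    return predictive_entropy(sampled_probs) - expected

def select_batch(pool, k, score=mutual_information):
    """Return the ids of the k highest-scoring unlabelled examples."""
    ranked = sorted(pool, key=lambda ex_id: score(pool[ex_id]), reverse=True)
    return ranked[:k]

# Hypothetical pool: class probabilities from three sampled models per example.
pool = {
    "noisy": [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],          # every model is unsure
    "contested": [[0.95, 0.05], [0.05, 0.95], [0.9, 0.1]],  # models disagree
    "easy": [[0.99, 0.01], [0.98, 0.02], [0.99, 0.01]],     # models agree
}

by_entropy = select_batch(pool, 1, score=predictive_entropy)
by_mutual_information = select_batch(pool, 1)
```

In this toy pool, plain entropy ranks the "noisy" example first because every sampled model is split 50/50 on it, while mutual information ranks the "contested" example first because the sampled models disagree about it. Labelling the latter is what actually narrows down the parameters.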