I am a Senior Lead Data Scientist at LexisNexis (a legal AI leader), working in the areas of Natural Language Processing (NLP) and Large Language Models (LLMs). I lead Data Science and AI for a stealth product that intelligently unifies workflow and legal research to transform how counsel accomplishes work. Our product will lead the way in seamlessly integrating “workflow” and “content delivery platforms” in novel ways, providing a garden path to structure legal work and generate insight.
Before Lexis, as a Team Manager and Lead Principal Scientist, I led the Artificial Intelligence Group for the Carol Data and AI platform at TOTVS Labs USA (a “startup” lab focusing on Computer Vision, NLP, and ERP/Tabular Analytics). My team was responsible for both the innovation and the implementation of features for products such as Carol ClockIn (facial recognition for employee clock-in), Carol Assistant (AI for conversational experience), and Deep Audit (insurance claims management with neural attention models). These products have a market-leading presence in Latin America.
Prior to TOTVS, I was a Machine Learning Tech Lead and Lead Principal Scientist at ABB (AI for power, robotics, and automation sectors). My innovations, projects, and technical leadership significantly contributed to and influenced the formation of ABB Ability.
My areas of work and interest in AI include NLP, Large Language Models (LLMs), Reinforcement Learning, and Computer Vision.
(For a full list, see Google Scholar and Patents)
Machine Learning and Data Mining
a) Intelligently unifying workflow and research: transforming how counsel accomplishes work. LexTech b) Intelligent recommendations for legal matter management c) Bots with Semantic Search and Language Understanding. Machine Learning and AI in Search, RELX Search Summit.
Considering neighborhood structural similarity in Non-negative Matrix Factorization (NMF) makes NMF well suited for time-series anomaly detection. Journal of Machine Learning Research (JMLR).
Deep Audit: Neural attention models on tabular and relational data for insurance claims decision automation. US patent pending. US 2022/0156573 A1.
ClockIn: Personnel time clock-in with computer vision (facial verification and recognition). US patent pending. US 2022/0076506 A1.
Computer vision for managing safety at industrial sites (on smartphones, video cameras, and edge devices). US Patent 10,573,147, etc.
a) Real-time AI powered by edge-deployed digital twins (Edge computing and analytics in synergy with the cloud). b) Managing solar asset performance with connected analytics. ABB Review. Digital Twins and Simulation Edition.
Technologies for decentralized fleet analytics (How can each customer learn and benefit from the wisdom of the entire fleet of customers' data without any customer sharing data with the central cloud to which all customers are connected?). US, etc. patent pending.
Cyber-attack detection with machine learning for networked electrical power system devices. US, etc. patents pending.
Location-aware analytics for industrial sites. US Patent 10,520,927, etc.
Technologies for solar power system performance model tuning with machine learning. WO, etc. patents pending.
Code Drones (On intelligent and socially active software artifacts that guide their own self-improvement; AI Bots for Software Engineering). Visions Track at International Conference on Software Engineering (ICSE Visions). Best paper runner-up.
Technologies for optimizing power grids through decentralized forecasting. US, etc. patents pending.
Industrial equipment installation (seamless information model updates for parts replacement; spare parts and inventory management). US Patent 10,331,119, etc.
Technologies for producing training data (using techniques such as Generative Adversarial Networks (GANs)) for identifying degradation of physical components. US, etc. patents pending.
Systems and methods for identifying anomalous events for electrical systems. WO, etc. patents pending.
Machine learning-based real-time intrusion detection using processor execution timing information on embedded systems. Workshop at Real-Time Systems Symposium (RTSS Workshop).
Data mining and graph analytics techniques for industrial alarm management. A series of patents and filings resulted from this work: US Patent 10,523,495, EP Patent 3 187 950, WO/2016/141007, US20190165989 (also published as CN111656418, WO/2019/104296, and EP3718093).
Diagnosis method and apparatus (analyzing logs from one or more robots for failure root cause clustered in time). US, etc. patents pending.
Mining API Specifications from source code for improving software reliability. PhD dissertation.
Machine learning for performance monitoring of services. Automated Software Engineering (ASE).
Static API Specification Mining: Exploiting source code model checking. Book Chapter. Mining Software Specifications: Methodologies and Applications.
Mining API error-handling specifications from source code. Fundamental Approaches to Software Engineering (FASE).
Mining API patterns as partial orders from source code: from usage scenarios to specifications. Foundations of Software Engineering (FSE).
Imp: A change impact analysis tool (Visual Studio plugin) for C/C++ that integrates with version control and build systems. Foundations of Software Engineering Tool Demo (FSE Tools).
Oracle-based regression test selection. International Conference on Software Testing (ICST).
Configuration selection using code change impact analysis for regression testing. International Conference on Software Maintenance (ICSM).
Impact analysis of configuration changes for test case selection. International Symposium on Software Reliability Engineering (ISSRE).
Practical change impact analysis based on static program slicing for industrial software systems. Industry Track at International Conference on Software Engineering (ICSE Practice).
Intelligent jamming in wireless networks with applications to 802.11b and other networks. Military Communications Conference (MILCOM). Nominee, Fred W. Ellersick best paper award.
Method for distributing keys for encrypted data transmission in wireless sensor networks. US Patent 7,702,905 (+DE/JP patents).
Secure comparison of encrypted data in wireless sensor networks. Wireless Modeling and Optimization Symposium (WiOpt).
Concealed data aggregation for reverse multicast traffic in sensor networks: Encryption, key distribution, and routing adaptation. IEEE Transactions on Mobile Computing. Featured article.