Blog
Sometimes we take a break from building cutting-edge AI redaction models to stretch our academic muscles and write about privacy and machine learning. Check back here regularly for our musings.
Demystifying De-identification: Understanding key tech for data protection regulation compliance
Anonymization vs de-identification vs redaction vs pseudonymization vs tokenization
What It Really Takes to Build An AI System: It’s more complicated than many think
There’s a saying, ‘the last 20% of the work takes 80% of the time’, and nowhere is that more true than in AI systems.
Natural Language v. Regex: The Context wars
Say you’re looking for credit card numbers. It’s quite easy to set up a regex that looks for 16-digit numbers or four groups of four numbers separated by a ‘-’. A regex like this is highly effective in the perfect world of computer data, but unfortunately the real world is much more complicated.
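As a rough illustration of the point above, here is a minimal Python sketch. The pattern, the function name `find_card_like`, and the sample strings are our own assumptions for illustration, not taken from the post itself. It shows the naive regex catching the two tidy formats, missing a space-separated number, and flagging a 16-digit order ID that is not a card at all.

```python
import re

# Hypothetical naive pattern: 16 contiguous digits, or four groups of
# four digits separated by '-'.
CARD_PATTERN = re.compile(r"\b(?:\d{16}|\d{4}(?:-\d{4}){3})\b")


def find_card_like(text):
    """Return every substring that matches the naive card pattern."""
    return CARD_PATTERN.findall(text)


if __name__ == "__main__":
    samples = [
        "Card: 4111111111111111",     # contiguous digits: matched
        "Card: 4111-1111-1111-1111",  # hyphen-separated: matched
        "Card: 4111 1111 1111 1111",  # space-separated: missed
        "Order ID 1234567890123456",  # not a card, but still matched
    ]
    for sample in samples:
        print(sample, "->", find_card_like(sample))
```

The last two samples are where context matters: the pattern alone cannot tell that the space-separated string is a card number or that the order ID is not one.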
Cybersecurity and Privacy: Complements for a more secure Internet
There exists a vibrant ecosystem of specialized security tools. The sad truth is that it is almost impossible to reach 100% invulnerability. What can we do to get closer?
Customers Are Demanding Privacy
Over the past three years, customer awareness of privacy has risen sharply. Many customers are now rethinking how they buy, taking their business elsewhere if they don’t trust a company’s data practices.
Privacy Enhancing Technologies Decision Tree (v2)
Privacy Enhancing Technologies Decision Tree: for developers, managers, and founders looking to integrate privacy into their software pipelines and products.
Liability & AI Malfunction: An AI System Developer’s Perspective
AI is rapidly being deployed around the world with few established rules to follow. Along with the complexity of creating the technology, there remain many unanswered legal questions.
Accelerating TensorFlow Lite with XNNPACK
The new TensorFlow Lite XNNPACK delegate enables best-in-class performance on x86 and ARM CPUs, over 10x faster than the default TensorFlow Lite backend in some cases.
NVIDIA DALI: Speeding up PyTorch
Some techniques to improve DALI resource usage & create a completely CPU-based pipeline.
Perfectly Privacy-Preserving Artificial Intelligence
We introduce the four pillars required to achieve perfectly privacy-preserving AI and discuss various technologies that can help address each of the pillars.
Homomorphic Encryption for Beginners: A Practical Guide (Part 2)
We discuss a practical application of homomorphic encryption to privacy-preserving signal processing, particularly focusing on the Fourier transform.
Homomorphic Encryption for Beginners: A Practical Guide (Part 1)
We cover the basics of homomorphic encryption, followed by a brief overview of open source HE libraries and a tutorial on how to use one of those libraries (namely, PALISADE).
Why is Privacy-Preserving Natural Language Processing Important?
A number of people ask us why we should bother creating NLP tools that preserve privacy. Apparently not everyone spends hours thinking about data breaches and privacy infringements.
A Brief Overview of Privacy-Preserving Software Methods
A very brief overview of privacy-preserving technologies follows for anyone who’s interested in starting out in this area. I cover symmetric encryption, asymmetric encryption, homomorphic encryption, differential privacy, and secure multi-party computation.