Privacy of business data by design
By Nabendu Das and Dr Kalpit Desai
The tabling of the Personal Data Protection Bill in 2019 has brought on a flurry of discourse on how India’s tech industry should follow Privacy by Design when dealing with personal and non-personal data. This article discusses the privacy of business data. A different set of design perspectives emerges when considering how tech companies ensure the privacy of their business customers’ data. All businesses need the support of tech companies that keep data privacy at the forefront of their offerings, enabling each business to retain complete control of its data while embracing technology in a secure manner.
Distinguishing between private and public data for a business
Before diving into the need for data privacy, it is important to understand what private data is in the context of a business. The bulk of the information used and processed by businesses is implicitly public data. For example, the existence of a brand, a product or a company is public data, since it is traded across multiple businesses. Business confidential information such as financials, godowns, inventory and bills of materials is private. A data point that violates anonymity is considered private; examples include PAN, TIN and bank account numbers. A master that participates only in internal transactions, and not in external ones, can potentially be considered private, as it carries information about a company’s internal processes.
Classifying data as private or public is an important decision that should be made conservatively: when in doubt about a specific data point, assume it is private.
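The conservative rule above can be illustrated with a small sketch. It assumes a hypothetical allow-list of implicitly public field names (the names below are invented for illustration and do not come from any real product); anything not on the list, including fields nobody has classified yet, is treated as private.

```python
from enum import Enum


class Sensitivity(Enum):
    PUBLIC = "public"
    PRIVATE = "private"


# Hypothetical field names, chosen only to mirror the examples in the text.
IMPLICITLY_PUBLIC = frozenset({"brand_name", "product_name", "company_name"})


def classify(field_name: str) -> Sensitivity:
    """Return PUBLIC only for fields explicitly listed as implicitly public."""
    if field_name in IMPLICITLY_PUBLIC:
        return Sensitivity.PUBLIC
    # Known-private and unclassified fields are handled the same way:
    # when in doubt, assume private.
    return Sensitivity.PRIVATE
```

The design choice here is to enumerate what is public rather than what is private, so that a newly added field is protected by default until someone deliberately marks it as public.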
Importance of Privacy by Design – for business data
For businesses dealing with personal data, an important facet in complying with data privacy laws is to “get consent” to access or store customer data needed to provide services. However, the notion of ‘getting consent’ is not applicable in the same manner when your customer is a business. Hence, the onus to articulate and practice concepts of data privacy for business customers lies with enterprises that cater to these business customers.
Similarly, it is important for a business to have full control over access to its data (e.g. financial data) to guard its competitiveness. Therefore, enterprises serving business customers should have a transparent and well-defined process for handling their customers’ data, so that those customers are always both aware of, and in control of, who can access their data.
An upfront focus on data privacy is a perceptible competitive advantage. Attention to data privacy from day one ensures that all of a company’s products are built with privacy considerations at their core from the design stage itself, instead of being bolted on at a later stage. For example, such a focus on privacy by design is arguably among the drivers that give Apple a competitive edge and help achieve the premium positioning of Apple devices in the computing and smartphone market.
Principles of Data Privacy by Design (DPbD)
For businesses to truly safeguard data, tech companies must offer solutions that ensure that control of data is fully with the customer. Tech companies must incorporate certain core principles while designing a system, so that it handles their customers’ data in a way that fully protects their privacy. Tally Solutions is an Indian multinational company that provides enterprise resource planning software. The software handles accounting, inventory management, tax management, payroll etc. and is used by nearly 2 million customers. Much of the architecture of the current product is premises-based, so control of customer data rests fully with the customer, including for the remote services that it provides. Tally uses a set of principles while designing a system to handle its customers’ data in a way that fully protects their privacy, whether the data is residing at the customer’s premises, is traveling to Tally’s backend systems, is traveling through Tally’s backend systems to third-party systems, or is stored in Tally’s backend systems.
Below is a set of design principles for data privacy. The term ‘backend systems’ here refers to the cloud-based backend system maintained by the software product provider.
- Customer data will lie on devices and on backend systems in the cloud: To ensure privacy of the data regardless of where it is stored, products must have:
  - A built-in access control mechanism that controls who can see and operate on which part of the data. The level of control is entirely at the customer’s discretion, from no control at all to whatever level the customer wants to set.
  - An encryption mechanism built into the product that the customer can optionally use. It is designed so that only the customer can read the data; anyone without the actual password, even the application developer, cannot.
- Customer data will move between their devices and the backend systems: A business customer’s data may need to move among their various devices and also between their devices and the backend systems. To ensure privacy of the data in transit:
  - As data travels between the customer’s premises and the backend systems, it needs to be protected from man-in-the-middle attacks as it passes through the internet. Data in transit is protected by modern and dynamic protocols that prevent outsiders from sniffing the data or interfering with it.
  - The customer payload should additionally be encrypted so that it can be deciphered only at the recipient endpoint, and not by any of the backend systems it may need to pass through for routing purposes.
- Customers will integrate with third parties: A business customer’s data may need to travel through the backend systems to avail of services from various third parties. To ensure privacy of the data while it is passing through the backend:
  - No data that could be decrypted or deciphered in the backend should ever be written to disk. This ensures that neither the software developer nor the operator can open the data in the backend.
  - Only the metadata used for routing and for correlating requests with responses is stored, and not any meaningful content of the customer data (which, in any case, can be decrypted only at the recipient’s endpoint).
- The software may need to log information for purposes such as compliance or troubleshooting:
  - No customer-related data is logged unless it is deemed implicitly public.
  - Compliance-related logging happens if the relevant authorities seek it. In such cases, operational safeguards are put in place to ensure that only authorized people can access the log, and all such accesses are themselves logged.
- Customers will avail of data-based services, including analytics: For any business customer’s data that is used to enrich the analytics database, enterprises must follow a set of rules to ensure that any decipherable data received is never identifiable and arrives only in anonymized and aggregated form:
  - Data can never be pulled by the backend systems; it can only be pushed by the enterprise’s client software running on the customer’s premises.
  - Any data that is private as per the definition above does not travel to the backend system.
  - Anonymization and aggregation mask the identity of the source:
    - The IP address of the source is not logged anywhere in the backend. This makes the flow of information irreversible as far as tracing back to the source is concerned.
    - Before any data is pushed to the backend, the client software running on the customer’s premises anonymizes and aggregates the data.
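The storage and logging rules in the list above (store only routing and correlation metadata, log nothing that is not implicitly public, never record the source IP address) can be sketched as an allow-list filter applied before anything is written. This is a minimal illustration under stated assumptions, not a description of any vendor’s implementation; the field names and the `loggable_view` and `format_log_line` helpers are hypothetical.

```python
import json

# Hypothetical allow-list: routing/correlation metadata plus implicitly
# public fields. Anything absent from this set is never written out.
LOGGABLE_FIELDS = frozenset({"request_id", "route", "status", "product_name"})


def loggable_view(event: dict) -> dict:
    """Keep only allow-listed keys; everything else is dropped, not masked."""
    return {key: value for key, value in event.items() if key in LOGGABLE_FIELDS}


def format_log_line(event: dict) -> str:
    """Serialize the filtered event as one JSON log line with stable key order."""
    return json.dumps(loggable_view(event), sort_keys=True)


event = {
    "request_id": "r-1",
    "route": "third-party-service",
    "status": "ok",
    "source_ip": "203.0.113.7",       # never logged
    "bank_account_no": "0000000000",  # private, never logged
    "payload": "<encrypted bytes>",   # opaque to the backend, never logged
}
log_line = format_log_line(event)
```

Because the filter drops unknown keys rather than masking known sensitive ones, a field added later cannot leak into the log unless it is deliberately added to the allow-list.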
Data about how the product is used provides deep insights that lead to improvements in product design. Using anonymization techniques while gathering such data ensures that customer identity cannot be reverse-engineered in the backend, thus completely protecting the customer’s privacy.
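As an illustration of anonymizing and aggregating on the customer’s premises before anything is pushed, the sketch below reduces raw usage events to per-feature counts. The event shape, the field names and the `aggregate_usage` helper are assumptions made for this example only.

```python
from collections import Counter
from typing import Iterable, Mapping


def aggregate_usage(events: Iterable[Mapping[str, str]]) -> dict:
    """Reduce raw usage events to per-feature counts.

    Only the 'feature' field is read, so customer and user identifiers
    present in the raw events never reach the outgoing payload.
    """
    counts = Counter(event["feature"] for event in events)
    return {"feature_counts": dict(counts)}


# Raw events exist only on the client side (hypothetical sample data).
raw_events = [
    {"customer_id": "C-001", "user": "asha", "feature": "invoice_print"},
    {"customer_id": "C-001", "user": "ravi", "feature": "stock_summary"},
    {"customer_id": "C-001", "user": "asha", "feature": "invoice_print"},
]

# This aggregated dictionary is the only thing the client would push.
payload = aggregate_usage(raw_events)
```

Building the payload from scratch, rather than deleting sensitive keys from a copy of the raw events, means an identifier can appear in the output only if the code explicitly puts it there.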
The Way Forward
The process of engineering a software system that adheres to DPbD often challenges broadly accepted ‘standard practices’. Generating limited or no logs and eliminating ‘root’ access once the system is commissioned are some examples. These implications are usually not obvious to, or appreciated by, all stakeholders in the workforce from the outset. Deliberate and coordinated efforts to educate stakeholders about these principles would greatly help in generating the necessary buy-in across the board.
Data privacy should take clear precedence over other business considerations such as market demand, profit and growth potential. This contrasts with the popular industry practice observed so far, in which data privacy receives only secondary attention. It would also help to invest upfront in brainstorming sessions that bring the product management, data analytics and software engineering teams onto the same page about the target subset of analytics products that fall within the DPbD boundaries while still promising enough to generate adequate business value. Enterprises should give these principles serious weight in technology and engineering decisions when building a product or service that deals with customers’ data.
(Nabendu Das is Head of Engineering, Tally Solutions; Dr Kalpit Desai is Founder & Chief Data Scientist, Datakalp LLP)