Abstract

Big data analytics (BDA) has become a crucial and effective tool employed by organisations and is intertwined with many aspects of individuals’ daily lives, with benefits in areas such as automated decision making and service innovation. However, as big data continues to grow, the use and spread of BDA increasingly raises ethical concerns about privacy, data accuracy, accessibility, and bias. This paper aims to discuss the issues present in each of these areas, propose how ethical standards can be maintained, and highlight specific case studies and their impacts. On privacy, the article considers the selling of user data, surveillance by companies for the purpose of improving BDA, and the importance of informed consent. The significance of accurate, high-quality, and transparent data that is accessible to all social groups, particularly educators and researchers, is also emphasised. Following this, the paper addresses the issues of inadequate or unclear data ownership legislation, discrimination in BDA tools (notably algorithms), and the role of human responsibility in the BDA process. As the potential of BDA only increases with the development of AI and quantum computing, the paper finishes by exploring future applications and research opportunities.

1. Introduction

1.1 Big Data

In today’s rapidly evolving digital landscape, the proliferation of data and technology has given rise to the concept of big data, fundamentally transforming the way we understand and interact with the world. Data, in its simplest form, refers to a collection of figures, facts, and statistics that can be used for reference or analysis, and is typically organised into structured formats, such as tables and databases (Cambridge Dictionary, 2019). However, the concept of big data extends far beyond traditional data. Big data represents a shift in how information is gathered, processed, and analysed. Unlike conventional datasets, which are often limited in size and scope, big data encompasses vast volumes of information collected from a multitude of sources, varying significantly in structure, format, and reliability (Abdalla, 2022).

The distinction between data and big data is highlighted by the 4Vs framework, which is frequently used to define the latter: Volume, Velocity, Variety, and Veracity (Gomathy, 2023). Volume refers to the sheer amount of data being generated; it is of a scale so vast that traditional data management tools struggle to keep up. Velocity pertains to the speed at which data is collected, processed, and analysed (Gomathy, 2023). In today’s digital era, data is generated at an unprecedented pace, necessitating real-time or near-real-time processing to extract valuable insights. Variety reflects the diversity of data formats, which range from structured datasets, like databases, to unstructured formats, such as text, images, and videos (Gomathy, 2023). Veracity, the final V, concerns the accuracy and trustworthiness of the data, which is critical for ensuring reliable and meaningful analysis (Gomathy, 2023).

The difference between traditional data and big data is not solely a matter of scale; it also reflects a qualitative shift in how data is perceived and utilised. This shift has been driven by the rapid expansion of data in recent years. With the advent of the digital age, the volume of data has grown exponentially, reaching levels that far exceed the capabilities of traditional data management systems. As Uma (2020) notes, data now expands from terabytes to petabytes and even exabytes, with digital data doubling approximately every two years. This exponential growth is a testament to the changing nature of data itself, which has evolved from relatively static and manageable datasets into dynamic and ever-expanding streams of information.

Several factors have contributed to the dramatic increase in data volume, the most significant of which is the widespread adoption of digital devices in everyday life. Smartphones, tablets, wearable technology, and other connected devices continuously generate data, capturing every interaction, transaction, and activity (Institute of Data, 2023). This constant data generation is further augmented by technological advancements that allow for the collection of diverse and multimodal data from a variety of sources. These include social media platforms, sensors embedded in smart devices, online transactions, and more. The result is the creation of extensive big data datasets that are characterised by their complexity and diversity (Uma, 2020). By 2020, it was estimated that 1.7 megabytes of new information were being created every second for every person on the planet, illustrating the staggering scale of data production in the modern world (Uma, 2020).

As data evolves into its larger and more complex form of big data, its impact on various aspects of our lives has grown in parallel. Big data is used in big data analytics (BDA) and has become a cornerstone of modern business strategies and decision-making processes. Its ability to provide deep, actionable insights has made it invaluable for organisations across industries. Companies now leverage big data to drive innovation, enhance operational efficiency, and deliver more personalised customer experiences. The rise of the digital era has ushered in a new wave of data-driven decision making, where organisations use big data to identify patterns, trends, and correlations within vast datasets. These insights are not just beneficial; they are often essential for maintaining a competitive edge in today’s fast-paced market (Institute of Data, 2023).

Given its transformative potential, it is essential to delve deeper into the advantages of big data. The next section will explore these benefits in detail, highlighting how big data is driving innovation, improving operational efficiency, and reshaping industries. By understanding the advantages of big data, we can better appreciate its role in the digital transformation that is redefining our world.

1.2 Advantages of Big Data

As mentioned in the previous section, big data plays a crucial role in furthering modern business strategies by enabling informed decision making, driving innovation, and providing a competitive advantage through the analysis of patterns and trends in vast amounts of data. This section focuses and expands on four advantages of big data: improvements in decision making, customer experience, product and service development, and risk management.

One of the primary advantages of big data is its ability to enhance decision-making processes. Because of its direct impact on the development of businesses and companies, big data has been considered by experts and company owners as a means of increasing the efficiency of important business decisions (Salari et al., 2022). When companies can manage and analyse their big data, they can discover patterns and unlock insights that improve and drive better operational and strategic decisions (Google Cloud, n.d.). For example, in an online learning environment, learners’ strengths and weaknesses, and whether they are likely to pass or fail, can be identified from behaviours such as how they play educational games, engage in off-task behaviour, fail to answer a question correctly despite having the required skill, or take a long time to answer (Bamiah et al., 2018).

Another major advantage of big data is its impact on improving customer experience. Businesses now have the ability to manage and interpret vast amounts of customer data through the use of big data. This data includes purchasing history, social media interactions, browsing behaviour, and demographic data (ADA, 2024). For instance, companies can use data to recommend products that align with a customer’s past purchases or browsing history, enhancing the relevance of their offerings (Google Cloud, n.d.). Netflix is a well-known example of a company utilising big data to their advantage, as they have created a powerful machine learning-powered recommendation engine. Netflix increased customer satisfaction by personalising content recommendations for each individual based on an analysis of user data, including watching patterns, ratings, and preferences (Medium, 2023). Consequently, the integration of big data into improved customer experience can lead to increased engagement and higher retention rates, driving long-term success for businesses (Medium, 2023).
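
To make the idea of personalised recommendation concrete, the sketch below implements a tiny user-based collaborative filter over viewing data. It is a minimal, hypothetical illustration: the ratings, titles, and approach are assumptions for the example, not Netflix’s proprietary engine.

```python
# Minimal sketch of user-based collaborative filtering on viewing data.
# Illustrative only: ratings and titles are hypothetical, and real recommendation
# engines (e.g. Netflix's) are proprietary and far more sophisticated.
import numpy as np

titles = ["Drama A", "Drama B", "Documentary C", "Documentary D"]
# Rows = users, columns = titles; values = ratings, 0 = not yet watched.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
], dtype=float)

def recommend(user_idx: int, k: int = 2) -> list[str]:
    """Recommend unwatched titles using similarity-weighted ratings from other users."""
    target = ratings[user_idx]
    norms = np.linalg.norm(ratings, axis=1) * np.linalg.norm(target)
    sims = ratings @ target / np.where(norms == 0, 1.0, norms)  # cosine similarity
    sims[user_idx] = 0.0                                        # ignore self-similarity
    scores = sims @ ratings / max(sims.sum(), 1e-9)             # predicted preference
    unwatched = target == 0
    ranked = np.argsort(-np.where(unwatched, scores, -np.inf))
    return [titles[i] for i in ranked[:k] if unwatched[i]]

print(recommend(0))  # suggests the unwatched title most liked by similar users
```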

Big data also improves product development and innovation, providing insights into market trends, customer needs, and market gaps. Companies can enhance marketing strategies by analysing information from social media, customer reviews, and industry reports. With innovation driven by such data, businesses can meet customer needs and stay relevant in a constantly changing market (Williams, 2024). One company using these strategies is Starbucks, which analyses big data from the rewards programme on its mobile app to determine which products should be expanded into supermarkets, demonstrating how the personal information of millions of users can help shape a company’s future product line (Marr, 2018). Using the collected demographics, behaviour, and preferences, companies can develop strategies that lead to an improved return on investment and higher conversion rates (Marr, 2018). Furthermore, by identifying unmet needs and shifts in the market, businesses can spot opportunities for new products or services. Big data not only helps companies address existing problems but also allows potential issues to be anticipated through predictive analytics, enabling efficient adjustments and the prevention of future obstacles.

The fourth main advantage of big data is improved risk management and fraud detection. Big data can reveal patterns and anomalies that indicate potential risks. Fraud and security breaches are best identified early, so that the company faces fewer consequences. To avoid these situations, businesses can take proactive measures to mitigate financial losses and protect the company’s reputation (Williams, 2024). These patterns and historical data help ensure the stable continuation of the company. Unlike traditional systems, big data allows companies to inspect large data sources such as transaction histories, customer behaviours, call analytics, and market intelligence. Accurate flagging identifies fraudulent activity and can even be personalised: based on consumer information, a system can detect actions that do not match a particular individual’s normal behaviour, and customers can be notified when suspicious activity occurs on their accounts. For instance, the bank J.P. Morgan Chase uses big data across a variety of transaction types to detect some of these abnormal activities (J.P. Morgan Chase & Co, 2024). This strategy helps minimise customer dissatisfaction and protects the reputation of the business.
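
To make the idea of personalised, behaviour-based flagging concrete, the sketch below applies a simple z-score test to a single customer’s transaction amounts. It is a minimal illustration under assumed figures, not a description of J.P. Morgan’s actual fraud-detection systems.

```python
# Minimal sketch of per-customer anomaly flagging on transaction amounts.
# Illustrative only: real fraud-detection systems use far richer features and models,
# and all figures below are hypothetical.
import statistics

def is_suspicious(history: list[float], new_amount: float, z_threshold: float = 3.0) -> bool:
    """Flag a transaction that deviates strongly from this customer's own history."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1e-9  # avoid division by zero
    z_score = abs(new_amount - mean) / stdev
    return z_score > z_threshold

past_purchases = [23.50, 41.00, 18.75, 35.20, 29.99]   # a customer's usual spending
print(is_suspicious(past_purchases, 32.00))   # False — within normal behaviour
print(is_suspicious(past_purchases, 4800.0))  # True  — flagged for review
```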

In conclusion, big data and BDA are crucial for organisations and businesses, offering key advantages in decision making, customer experience, product development, and risk management. The ability to analyse vast amounts of data enables businesses to uncover valuable insights, optimise operations, and stay ahead of market trends, which empowers companies towards a path of sustained success in a competitive landscape. However, despite the wide variety of benefits for businesses and beyond, the current collection and application of big data and BDA is not always done with ethical guidelines or restrictions in mind.

2. Ethical Considerations in Big Data Analytics

This section explores the key ethical issues, including privacy concerns related to the collection and use of personal data, the need for accuracy and transparency in data handling and analysis, challenges surrounding data accessibility and ownership, and the potential for bias and unfairness in algorithmic decision making. Understanding these issues is crucial for developing strategies that mitigate risks, protect individual rights, and ensure that big data analytics is applied in a fair and ethical manner.

2.1 Privacy

As stated in the introductory section, big data has transformed several industries. However, the ethical issue of ensuring privacy is raised repeatedly and remains particularly significant. Data privacy refers to the practice of ensuring that personal information, collected and stored by businesses, is safe from misuse and unauthorised access (Nasikanov, 2024). Privacy advocates argue that as the data ecosystem develops, power dynamics between the government, businesses, and people may be altered, leading to racial or other profiling, discrimination, over-criminalisation, and other restricted freedoms (Polonetsky and Tene, 2013).

Many firms promise that privacy is at the centre of their businesses and that they are careful to never sell information that can be traced back to a person, but researchers studying anonymised location data have shown just how misleading that claim can be (Keegan and Ng, 2021). An anonymised dataset is devoid of any evident identifiers, such as a name, home address, or phone number. Yet external information can be utilised to connect the data to a person if their patterns are unique enough (Montjoye et al., 2013). Additionally, surveillance practices, such as those involved with smart speakers like Amazon’s Alexa, further illustrate privacy concerns. This is because Amazon employs thousands of workers worldwide to listen to and transcribe voice recordings captured by Alexa devices. Even though these human reviews are crucial for refining the algorithms that power smart speakers, especially when it comes to understanding different languages and accents, the process still raises privacy concerns, as it involves humans listening to personal conversations within a person’s home (Day et al., 2019).
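
The sketch below illustrates, with entirely hypothetical records, how an “anonymised” location trace can be linked back to a person once a few frequented places are known from outside sources, in the spirit of the findings by Montjoye et al. (2013).

```python
# Minimal sketch of re-identifying an "anonymised" location trace using outside knowledge.
# All records, pseudonyms, and places below are hypothetical.

# Anonymised dataset: pseudonymous IDs mapped to frequently visited cell towers.
anonymised = {
    "user_0413": {"tower_12", "tower_88", "tower_3"},
    "user_0922": {"tower_7",  "tower_88", "tower_41"},
    "user_1107": {"tower_12", "tower_55", "tower_3"},
}

# External knowledge about a specific person (e.g. home, workplace, gym areas).
known_points = {"tower_12", "tower_88", "tower_3"}

# The pseudonym whose trace contains all the known points identifies the person.
matches = [pid for pid, towers in anonymised.items() if known_points <= towers]
print(matches)  # ['user_0413'] — a unique match links the "anonymous" record back to the person
```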

Therefore, preserving privacy in BDA is a complex but essential task. One method of preserving privacy is differential privacy, a framework for ensuring the privacy of individuals in datasets (Devaux, 2024). It can provide a strong guarantee of privacy, allowing analysts to examine data without revealing sensitive information about any individual in the dataset (Devaux, 2024). Apple offers an example of differential privacy in practice: it collects data about how users interact with their devices, such as which features are used most often (Devaux, 2024). Differential privacy at Apple involves adding slight statistical noise to users’ data on their devices before it is shared with the company (Apple, n.d.). This process ensures that Apple cannot reconstruct the original data, protecting individual privacy. When aggregated across many users, the noise averages out, allowing Apple to extract meaningful insights without compromising personal information (Apple, n.d.).
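
The sketch below shows the general idea behind differentially private noise addition, using a Laplace mechanism on a simple count query. It is a generic, hypothetical illustration; Apple’s actual deployment uses local differential privacy with its own mechanisms and parameters.

```python
# Minimal sketch of the Laplace mechanism for a differentially private count query.
# Generic illustration only: the epsilon value and the data are hypothetical assumptions.
import numpy as np

rng = np.random.default_rng(seed=0)

def private_count(values: list[bool], epsilon: float) -> float:
    """Return a noisy count whose noise scale is calibrated to the query's sensitivity."""
    true_count = sum(values)
    sensitivity = 1.0  # adding or removing one person changes the count by at most 1
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

uses_feature = [True] * 620 + [False] * 380      # 1,000 hypothetical users
print(private_count(uses_feature, epsilon=0.5))  # close to 620, but no individual is revealed
```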

2.1.1 Consent in Privacy

Before privacy is maintained in post-collection processing through techniques such as differential privacy, allowing users to give full consent is vital in preserving the right to privacy. Data collection has become increasingly prominent throughout all fields and addressing consent within the realm of big data has emerged as a critical ethical and legal issue. Consent is a major cornerstone in privacy because it enables individuals to make informed decisions about how their data is used. In practice, however, obtaining meaningful consent is a difficult challenge to tackle (Usercentrics, 2024).

The primary difficulty lies in the complexity of data ecosystems (Perez, 2020). Users are often unaware of how their data is collected, aggregated, and analysed across multiple platforms, which makes it difficult to provide truly informed consent. Privacy policies are also often written in complex legal language that makes it difficult for most individuals to fully understand the implications of their consent (Office of the Privacy Commissioner of Canada, 2016). This creates situations where consent is given without a clear understanding of the potential risks involved (Cate and Mayer-Schönberger, 2013).

While consent is a foundation for upholding privacy, the complexity of data ecosystems and the lack of transparency in privacy policies significantly undermine its effectiveness. Moreover, relying on post-collection methods such as differential privacy still requires the underlying data to be accurate and of high quality in order to be valuable for decision making. During data collection, users often struggle to understand the full implications of their consent, particularly when faced with bundled consent practices. This highlights the need for organisational transparency about data collection and processing methods, so that user consent is informed. As big data continues to grow and embed itself in every aspect of modern life, these challenges must be addressed to ensure that consent remains a genuine tool for protecting an individual’s privacy rather than a mere procedural formality.

2.2 Accuracy and Transparency

As stated in the introduction, big data is most commonly characterised by the 4Vs. The greater volume, velocity, and variety of data formats in big data analytics also increase the risk of error and the impact of inaccuracies, especially when multiple sources are aggregated (Hariri et al., 2019; Richardson et al., 2021). As BDA is increasingly used for decision making, accuracy-related issues are highly notable for both researchers and the organisations or businesses that employ BDA. Transparency of datasets and processing methods pairs with data accuracy, as greater transparency builds trust and reliability for both users and developers within organisations, and demonstrates to users that their privacy is sufficiently protected.

Big data is seen as “low quality” if it is outdated, incomplete, contains incorrect data or is poorly selected (European Union Agency for Fundamental Rights, 2018, p.5). There are various assessment criteria that can be utilised by both researchers and organisations that employ BDA, which can be summarised as: whether the source of data is reliable; how errors and bias can be identified and reported; what the threshold for inaccuracy should be; how BDA tools can be tested for continual feedback on accuracy; and where the responsibility of accuracy lies (Richardson et al., 2021). Unfortunately, these criteria are not often considered before datasets are selected for BDA tools.

Accuracy is especially important in BDA when sensitive information is involved, as in domains such as finance and healthcare, because errors and algorithmic bias can have serious consequences (Ikezuruora, 2023). Within the healthcare sector, for instance, the more accurate the data, the more reliable a patient’s diagnosis and treatment plan, which is crucial because decisions are based on that individual’s personal information (Jorgovan, n.d.). To prevent the consequences of inaccurate data, advanced algorithms can be used to identify and correct errors in big data. In the financial domain, the stakes of handling sensitive data were illustrated by the Equifax data breach in 2017, where information such as social security numbers and addresses was exposed due to a vulnerability in an application framework (Fruhlinger, 2020).

Estimates suggest that poor data quality can cost the US economy around $3.1 trillion per year (IBM, 2016 in Hariri et al., 2019, p.5), and can cause failures in service provision or damage the reputation of the organisation concerned (UK Office for National Statistics, 2021). The consequences of low-quality or inaccurate data also link closely to ethical concerns about fairness. An example is the big data used to train facial recognition models, where faces from different races are insufficiently represented (Lohr, 2018). One study found that the technology has an error rate of only 0.8% for lighter-skinned men, but the rate increases to 34.7% for darker-skinned women (Hardesty, 2018). However, the study was not able to access the training data used or how it was collected, highlighting the need for transparency in resolving these issues (Buolamwini, 2023).
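
Audits such as the one cited above typically work by disaggregating a model’s error rate across demographic subgroups. The sketch below shows that calculation on a handful of hypothetical predictions; the records and group labels are illustrative assumptions, not the study’s data.

```python
# Minimal sketch of disaggregating a classifier's error rate by subgroup.
# All records below are hypothetical audit results, not real study data.
from collections import defaultdict

# Each record: (subgroup label, true label, predicted label).
results = [
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("darker-skinned women", "female", "male"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "female", "male"),
]

errors, totals = defaultdict(int), defaultdict(int)
for group, truth, predicted in results:
    totals[group] += 1
    errors[group] += int(truth != predicted)

for group in totals:
    print(f"{group}: error rate {errors[group] / totals[group]:.0%}")
# lighter-skinned men: error rate 0%
# darker-skinned women: error rate 67%
```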

Big data transparency is also important in ensuring success in any large company. Greater transparency gives stakeholders the clarity needed to understand which transactions are being performed and which decisions to make (Ikezuruora, 2023). It helps companies make the right decisions, as a broader audience is aware of what is happening in the company. Better transparency also makes operations safer and more reliable, and it becomes easier to track any errors in the analytics process (Ikezuruora, 2023).

Transparency about big data develops trust within the company; once this trust is built up among stakeholders, a foundation for credibility can be established, which strengthens the organisation and its reliability (Morey et al., 2015). When so many people are involved in company operations, both employees and users, one mistake can lead to a cascade of leaks and liability issues. For instance, in the Facebook–Cambridge Analytica scandal of 2018, sensitive information about millions of users was shared without their consent, resulting in widespread public scrutiny and legal consequences (Confessore, 2018). Strong transparency reduces the backlash if such a situation occurs: being open about operations and about what is processed with big data strengthens the relationship between the company and its stakeholders, helps avoid major conflict, and builds an ethical reputation (Ikezuruora, 2023).

Companies such as Google have taken steps towards greater transparency around personal data. One such measure is the Google Dashboard, which allows users to sort through and view the personal data Google has collected from them (Google, 2009). In this way, users are granted direct access to the data they generate and can make better-informed privacy choices on their Google account, as discussed in the previous section. This demonstrates a path other businesses should follow, in which data collectors become more transparent about how information is collected, used, and shared.

Datasets used in BDA may be inaccurate, outdated, or incomplete, which can have harmful impacts on decision making, particularly with sensitive information. Ensuring data quality and accuracy means assessing the sources and sampling methods, while maintaining transparency about what is collected and how. Transparency allows datasets to be examined for quality control, to maintain reputation and public trust in BDA use, by allowing users and third parties more direct access to, and possibly ownership of, collected data. Ensuring full transparency in data collection practices also enhances the quality of informed consent and strengthens the regulation of privacy protections. This generates success and greater innovation towards data protection in organisations, by permitting all stakeholders to have greater trust and knowledge in how BDA is applied.

2.3 Accessibility and Data Ownership

Organisational transparency should include increased data access for all, particularly users or researchers, in order to assess data accuracy and privacy. Access to big data is closely linked to the problem of information inequality, where some groups have more access to information than others, worsening existing inequalities (Li & Wu, 2023). In today’s world, it is important to share and use information fairly to prevent certain groups from having too much control. However, differences in technology and economic development, both between and within countries, create large gaps in access to information. These gaps are especially noticeable between rich and poor regions, and between cities and rural areas (Li & Wu, 2023). To address these issues, it is essential to ensure that everyone has fair access to big data.

Big data accessibility is a significant issue in the digital age. While open data initiatives aim to make data freely available for use and distribution, most big data, especially from tech corporations, remains inaccessible to the public (Richterich, 2018). This inaccessibility is often justified by concerns over privacy and corporate competitiveness. However, it creates a “big data divide”, where only those within corporations or affiliated research projects can access and benefit from this valuable information (Richterich, 2018, p.9). This divide limits the ability of external individuals, such as independent researchers, to analyse and critique the data, reinforcing power imbalances in society (Richterich, 2018, p.9).

As stated in the introduction, big data has the potential to transform various fields, for instance, education, but its accessibility remains a significant challenge. For big data to be truly effective, it should be available to those who need it – educators, policymakers, and researchers. However, issues such as data ownership, privacy laws, and technological barriers often restrict access (Veldkamp et al., 2021). For instance, data may be siloed within different organisations, making it difficult to connect and analyse. Additionally, strict privacy regulations, while necessary, can limit the availability of data for educational research (Veldkamp et al., 2021). Even when data is accessible, it is often in formats that are not easily usable, requiring significant time and resources to process. These barriers highlight the need for improved data accessibility to ensure that big data can be leveraged to enhance educational outcomes (Veldkamp et al., 2021).

However, alongside the need for greater accessibility, the question of data ownership emerges as a critical concern. In the big data era, the vast amount of data being created, collected, and analysed has made data ownership a pivotal issue in our society. Data ownership is not solely about the possession of data; it also encompasses concerns regarding control, access, and the right to use the data (Hummel et al., 2020). As organisations increasingly rely on big data for data-driven decisions, the potential ethical, legal, and operational consequences intensify. Hummel et al. (2020) highlight that the rise of data-driven industries has intensified debates over who truly owns data: the individuals who generate it, the entities that collect and process it, or the wider community or society. Unlike physical property, data is generated by individuals but often controlled and monetised by corporations, raising significant questions about who truly owns this information (Bird & Bird LLP, 2019).

Current legal frameworks primarily focus on data protection and privacy rather than clearly defining data ownership (Bird & Bird LLP, 2019). This legal ambiguity leaves individuals with limited control over their data once it is collected. Terms of service agreements are often lengthy and complex, leading users to unknowingly cede control over their personal data to companies. This situation creates a power imbalance where corporations benefit from the commercial use of personal data while individuals are excluded from the economic value generated (Bird & Bird LLP, 2019).

Ethical considerations further complicate the issue of data ownership. Hummel, Braun, and Dabrock (2020) argue that data should be viewed as a personal asset, akin to traditional forms of property. By recognising data ownership rights, individuals could gain greater control over how their data is used, shared, and monetised. This approach aims to address concerns about transparency and consent, providing a more equitable framework for data governance. Such frameworks can enhance individual empowerment and ensure that the benefits of data-driven innovations are more fairly distributed (Hummel, Braun, and Dabrock, 2020).

Implementing these new governance models could rebalance the relationship between data subjects and data controllers. By balancing data access inequality, issues such as algorithmic bias can be better tackled as more groups are represented fully within big data. By granting individuals ownership rights over their data, society could move towards a more transparent and fair system, where data is not merely a resource for corporate profit but a personal asset that individuals can manage and benefit from.

2.4 Bias and Fairness

The use of BDA in decision-making algorithms has become increasingly entwined with the daily lives of individuals. Processes and services such as loan acceptance, granting parole, and hiring, which directly affect future opportunities and quality of life, rely more and more on the results of automated data analysis (Favaretto et al., 2019). However, these algorithms and their training data are often neither easily accessible nor transparent, which hides the biases and lack of diversity inside the data. Several studies (Favaretto et al., 2019; Prescott, 2023) argue that algorithmic profiling, scoring, classification, and predictive systems powered by big data, as well as big data itself, are commonly understood as accurate, objective tools capable of overcoming the social and cultural biases of human judgement. Controversially, Anderson (2008) suggests that BDA, due to the sheer volume of information being handled, would naturally give the “correct” or optimal result, without the need for human input or understanding.

More and more research is revealing that this idealised view of BDA’s abilities is not only clearly false, but also dangerous (European Union Agency for Fundamental Rights, 2018). The “illusion of objectivity” associated with machines and systems (Lohmann, 2019, para.2) hides the fact that big data reflects human behaviour, and so algorithms are distinctly human creations. The decision making required in sampling and preparing training data, testing algorithms, deploying them, and ultimately drawing meaning from their outputs is a human responsibility (Crawford, 2013; Heilweil, 2020). Therefore, BDA and algorithmic processing are imbued with the many prejudices, biases, and knowledge gaps of their developers, and thus may reinforce or even institutionalise discrimination and marginalisation of communities (Lohmann, 2019; Richardson et al., 2021).

Bias in BDA can be introduced early on, when datasets are chosen or data sources are selected for collecting training data. Training data is labelled by developers, and the algorithm is taught whether its classifications or predictions are right or wrong based on those labels, improving its “accuracy” until it is fit for analysis (Heilweil, 2020, para. 6). If the training datasets used are systematically prejudiced or contaminated, the algorithm will learn to reproduce this prejudice in future outputs and perpetuate discrimination against certain groups (European Union Agency for Fundamental Rights, 2018). The chosen sources of big data for training may be inherently biased, for example if there have historically been systematic differences in how data is collected for different groups, or if the data derives from an institution with systemic prejudice (Favaretto et al., 2019; Heilweil, 2020). It is also possible that data may be incomplete and lacking in diversity, which relates to the concern of data accuracy.

The example of Amazon’s automated recruitment engine clearly highlights how real-life discrimination becomes embedded and reflected in BDA tools. From 2014, Amazon software developers built an artificial intelligence (AI) program to scan resumes and rate applicants for software-related roles, trained on resumes submitted to the company over ten years (Dastin, 2018). By 2015, the model had learnt to heavily penalise female candidates and any resumes containing terms such as “women” or “female”. This was because the majority of resumes in the training data were from men, as the technology industry is largely male-dominated, and the algorithm therefore prioritised this pattern in its decisions (Dastin, 2018).
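
A basic audit of the kind that can surface this problem before training is to measure how each group is represented, and labelled, in the training data. The sketch below does this with hypothetical figures, not Amazon’s actual data.

```python
# Minimal sketch of auditing a labelled training set for group imbalance before model training.
# All records and counts below are hypothetical.
from collections import Counter

# Hypothetical labelled training records: (applicant gender, outcome label).
training_data = (
    [("male", "hire")] * 880 + [("male", "reject")] * 420
    + [("female", "hire")] * 60 + [("female", "reject")] * 140
)

group_sizes = Counter(gender for gender, _ in training_data)
positive_rates = {
    g: sum(1 for gender, label in training_data if gender == g and label == "hire") / n
    for g, n in group_sizes.items()
}
print(group_sizes)      # Counter({'male': 1300, 'female': 200}) — severe under-representation
print(positive_rates)   # male ≈ 0.68, female = 0.30 — a model trained on this will absorb the gap
```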

While BDA tools are now commonly used by various organisations to facilitate or even automate decisions and innovations, sets of big data can often hide bias and prejudice, or be unrepresentative of certain groups. This can directly lead to BDA tools, such as algorithms or artificial intelligence, that reinforce or worsen discriminatory practices. Unless the human responsibility of regulating big data to be fair and diverse is emphasised, the continual use of these models and algorithms involved in sensitive areas such as loans or criminal justice will only lead to a future society with greater inequalities and systemic discrimination. Considering core principles, such as privacy protection, transparency, accuracy, accessibility, and fairness, in BDA is vital for developing and applying future big data technologies positively and ethically.

3. Future of Big Data Analytics

The outlook on the future is always shifting as new areas of research are uncovered and as the field of big data develops. Several significant developments are shaping the direction of big data as businesses work to exploit the enormous volumes of data that users contribute. Changes such as Machine Learning (ML) with Artificial Intelligence (AI) and advancements in quantum computing grow alongside apprehensions over moral and legal issues in the field, which all affect the future of big data technology.

3.1 Artificial Intelligence and Machine Learning

AI and ML are likely to remain at the core of big data analytics in the future. AI-powered systems can analyse enormous volumes of data and uncover intricate patterns, making complicated forecasts possible that people would not be able to produce otherwise (IABAC, 2024).

Beyond processing structured data, AI is already sophisticated enough to handle unstructured data such as images, videos, and text (Forbes, 2024). This has opened new possibilities for industries that rely on multimedia data. Healthcare, for example, could train an AI on patient data to identify patterns that lead to earlier diagnoses and more effective treatments (Dave and Patel, 2023). Current attempts at using big data in this way have demonstrated promising results for disease prediction and outcome forecasting (Esteva et al., 2017).
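
As a toy illustration of training a model on structured patient records to flag elevated risk, the sketch below fits a logistic regression on a handful of hypothetical examples; real clinical models, such as the deep networks of Esteva et al. (2017), operate on vastly larger and richer data.

```python
# Minimal sketch of fitting a risk model on structured patient records.
# All features, values, and labels below are hypothetical; this is not a clinical tool.
from sklearn.linear_model import LogisticRegression

# Features: [age, resting blood pressure, cholesterol]; label: 1 = later diagnosed.
X = [[44, 118, 180], [52, 135, 210], [61, 150, 250], [39, 110, 170],
     [67, 160, 280], [48, 125, 200], [58, 145, 240], [35, 105, 160]]
y = [0, 0, 1, 0, 1, 0, 1, 0]

model = LogisticRegression(max_iter=5000).fit(X, y)
new_patient = [[63, 155, 260]]
print(f"Estimated risk: {model.predict_proba(new_patient)[0][1]:.2f}")  # probability of the positive class
```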

3.2 Progress in Quantum Computing

The future of BDA does not lie only with big data-processing algorithms, which are already heavily used. Quantum computing is another technology that could impact big data in important ways. Traditional computers use bits, which have two states, 0 or 1, to store and process information. Quantum computers use qubits, which can represent both states at the same time due to quantum superposition (IBM, 2024).
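
Formally, a single qubit’s state is usually written as a superposition of the two basis states, shown below in standard notation (a general textbook formulation, not tied to any particular source cited here).

```latex
% State of a single qubit as a superposition of the basis states |0> and |1>
\[
  \lvert \psi \rangle = \alpha \,\lvert 0 \rangle + \beta \,\lvert 1 \rangle,
  \qquad \lvert \alpha \rvert^{2} + \lvert \beta \rvert^{2} = 1
\]
% |alpha|^2 and |beta|^2 give the probabilities of measuring 0 or 1; a register of n qubits
% can therefore hold 2^n amplitudes at once, which underlies the potential speed-ups discussed below.
```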

Quantum computing is still in its early stages, but it has the potential to significantly impact big data analytics. Once quantum computing is fully developed, it may be able to solve optimisation problems that are currently too complex for computers to handle. This technology has the potential to greatly improve the accuracy and efficiency of risk assessments, portfolio optimisations, and drug development processes in industries like finance and pharmaceuticals (Arute et al., 2019). Furthermore, quantum-enhanced ML can accelerate model training by enabling previously unthinkable speed and scale in the extraction of insights from data (Biamonte et al., 2017).

3.3 Edge Computing and Real-Time Data Processing

Another newer technology that could aid in the processing of big data is edge computing. Edge computing is a distributed computing model that brings computation and data storage closer to the source of data. More broadly, it refers to any design that pushes computation physically closer to a user to reduce latency (Wikipedia, 2024).

Edge computing proves particularly advantageous in scenarios where split-second decisions are critical. Unlike cloud computing, edge computing significantly reduces latency, making it ideal for rapid data processing tasks, such as those in autonomous vehicles. As Satyanarayanan (2017) suggests, edge computing effectively complements cloud technology by offering low-latency, location-aware services necessary for time-sensitive applications, highlighting its growing importance in the future of computing.

4. Conclusion

In conclusion, the rise of big data, defined by the 4Vs — Volume, Velocity, Variety, and Veracity — has fundamentally transformed data management and decision making across industries (Gomathy, 2023). As organisations leverage big data to enhance predictions, optimise operations, and gain competitive advantages, its impact on sectors such as business, healthcare, and government continues to grow, driving innovation and improving outcomes. The benefits of big data are numerous, including enhanced decision making, improved product development and innovation, better customer experience, and superior risk management and fraud detection (Salari et al., 2022).

However, the increase in big data also raises significant ethical concerns, particularly around privacy, consent, and data ownership. Users often lack awareness of how their data is collected, aggregated, and analysed across multiple platforms, posing challenges to the preservation of privacy. Differential privacy has emerged as a potential solution to safeguard individual privacy within datasets, yet the complexity of data ecosystems remains a barrier.

Data accuracy and transparency are critical for ensuring reliable information and fostering trust within organisations. Inaccurate data can lead to faulty predictions and wasted resources, while transparency builds credibility and strengthens organisational reliability. Furthermore, issues of accessibility and ownership exacerbate information inequality, with much big data remaining inaccessible due to privacy concerns and corporate interests. The legal ambiguity surrounding data ownership leaves individuals with limited control over their data, often monetised by corporations.

Bias and fairness are also central concerns in the data analytics lifecycle, with bias potentially skewing results at any stage and fairness being crucial in testing processes. 

As the future of big data is shaped by advancements in AI, quantum computing, and edge computing, ethical considerations must evolve to ensure responsible and equitable use of big data. Addressing these challenges will require the development of equitable governance models that balance power dynamics and promote fairness, transparency, and accountability in the era of big data.

Bibliography

Abdalla, H.B. (2022). A brief survey on big data: technologies, terminologies and data-intensive applications. Journal of Big Data, [online] 9(1). doi: https://doi.org/10.1186/s40537-022-00659-3. [Accessed 22 Aug. 2024].

ADA (2024) 5 Ways Big Data Improves Customer Experience. Available at: https://www.ada-asia.com/insights/big-data-improves-customer-experience [Accessed: 20th August 2024].

Angwin, J. et al. (2016) ‘Machine bias’, ProPublica. Available at: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing [Accessed on 22 August 2024].

Anonos (2024) What is Differential Privacy: Definition, Mechanisms, and Examples. Available at: https://www.anonos.com/blog/what-is-differential-privacy-definition-mechanisms-examples [Accessed: 21st August 2024].

Apple (n.d.) Differential Privacy. Available at: https://www.apple.com/privacy/docs/Differential_Privacy_Overview.pdf [Accessed: 24th August 2024].

Arute, F., Arya, K., Babbush, R., Bacon, D., Bardin, J.C., Barends, R., … & Neven, H. (2019) ‘Quantum supremacy using a programmable superconducting processor’, Nature, 574(7779), pp.505-510. doi: https://doi.org/10.1038/s41586-019-1666-5 [Accessed on 19 August 2024].

Bamiah, M.A., Brohi, S.N., Rad, B.B. (2018) ‘Big Data Technology in Education: Advantages, Implementations, and Challenges’, Journal of Engineering Science and Technology, 13(Special Issue on ICCSIT 2018), 229-241. Available at: https://jestec.taylors.edu.my/Special%20Issue%20ICCSIT%202018/ICCSIT18_19.pdf [Accessed: 20th August 2024].

Bergmann, C. and Lohmann, M. (2019) ‘Are algorithms objective? No, that’s an illusion’, Deutsche Telekom. Available at: https://www.telekom.com/en/company/digital-responsibility/details/are-algorithms-objective-an-illusion-575054 [Accessed on 22 August 2024].

Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N., & Lloyd, S. (2017) ‘Quantum machine learning’, Nature, 549(7671), pp.195-202. doi: https://doi.org/10.1038/nature23474 [Accessed on 28 August 2024].

Bowers, K.M., Fredericks, E.M. and Hariri, R.H. (2019) ‘Uncertainty in big data analytics: Survey, opportunities, and challenges’, Journal of Big Data, 6(1).

Buolamwini, J. (2023) Unmasking the bias in facial recognition algorithms. MIT Sloan. Available at: https://mitsloan.mit.edu/ideas-made-to-matter/unmasking-bias-facial-recognition-algorithms [Accessed: 30 August, 2024].

Cambridge Dictionary (2019). DATA | meaning in the Cambridge English Dictionary. [online] Cambridge.org. Available at: https://dictionary.cambridge.org/dictionary/english/data. [Accessed 22 Aug. 2024].

Chen, H., Chiang, R.H., & Storey, V.C. (2012) ‘Business Intelligence and Analytics: From Big Data to Big Impact’, MIS Quarterly, 36(4), pp.1165-1188. Available at: https://www.researchgate.net/publication/284679162_Business_Intelligence_and_Analytics_From_Big_Data_to_Big_Impact [Accessed on 30 August 2024].

Colin Biggers & Paisley (2019) What is real consent to data collection? Available at: https://www.cbp.com.au/insights/insights/2019/november/what-is-real-consent-to-data-collection [Accessed: 29th August 2024].

Confessore, N. (2018) ‘Cambridge Analytica and Facebook: The Scandal and the Fallout So Far’, The New York Times. Available at: https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html [Accessed: August 31, 2024].

Crawford, K. (2013) ‘The hidden biases in big data’, Harvard Business Review. Available at: https://hbr.org/2013/04/the-hidden-biases-in-big-data [Accessed on 21 August 2024].

Dave, M. and Patel, N. (2023) ‘Artificial Intelligence in Healthcare and Education’, British Dental Journal, 234(10), pp.761-764. doi: https://doi.org/10.1038/s41415-023-5845-2 [Accessed on 20 August 2024].

De Montjoye, Y.A., Hidalgo, C.A., Verleysen, M., Blondel, V.D. (2013) ‘Unique in the Crowd: The privacy bounds of human mobility’, Scientific Reports, 3(1376), 1. Available at: https://www.nature.com/articles/srep01376#citeas [Accessed: 21st August 2024].

Esteva, A., Kuprel, B., Novoa, R., et al. (2017) ‘Dermatologist-level classification of skin cancer with deep neural networks’, Nature, 542, pp.115–118. doi: https://doi.org/10.1038/nature21056 [Accessed on 20 August 2024].

European Union Agency for Fundamental Rights (2018). ‘BigData: Discrimination in data-supported decision making’, FRA Focus. Available at: https://fra.europa.eu/en/publication/2018/bigdata-discrimination-data-supported-decision-making [Accessed: 22 August, 2024].

Favaretto, M., De Clercq, E. and Elger, B.S. (2019) ‘Big Data and discrimination: Perils, promises and solutions. A systematic review’, Journal of Big Data, 6(1). doi: https://doi.org/10.1186/s40537-019-0177-4 [Accessed on 21 August 2024].

Forbath, T., Morey, T., and Schoop, A. (2015) Customer data: Designing for transparency and trust. Harvard business review. Available at: https://hbr.org/2015/05/customer-data-designing-for-transparency-and-trust [Accessed: August 31, 2024].

Forbes (2024) ‘How AI Can Unlock The Power Of Unstructured Data’. Available at: https://www.forbes.com/councils/forbestechcouncil/2024/05/24/how-ai-can-unlock-the-power-of-unstructured-data/ [Accessed on 30 August 2024].

Fruhlinger, J. (2020) ‘Equifax data breach FAQ: What happened, who was affected, what was the impact’, CSO. Available at: https://www.csoonline.com/article/567833/equifax-data-breach-faq-what-happened-who-was-affected-what-was-the-impact.html [Accessed: August 31, 2024].

Uma, G. (2020). Transition and Evolution of Data into Big Data in the 21st Century. Bulletin Monumental, 21(12), pp.193–203. [Accessed 22 Aug. 2024].

Gomathy, C.K. (2023). A Survey on Big Data Analytics: Challenges, Open Research Issues and Tools. International Journal of Scientific Research in Engineering and Management (IJSREM), 06(12). doi: https://doi.org/10.55041/IJSREM16959. [Accessed 22 Aug. 2024].

Google (2009) Transparency, choice and control — now complete with a Dashboard! Available at: https://googleblog.blogspot.com/2009/11/transparency-choice-and-control-now.html [Accessed on 31st Aug 2024].

Google Cloud (n.d.) What is Big Data. Available at: https://cloud.google.com/learn/what-is-big-data#big-data-benefits [Accessed: 20th August 2024].

Hardesty, L. (2018) Study finds gender and skin-type bias in commercial artificial-intelligence systems. MIT News. Available at: https://news.mit.edu/2018/study-finds-gender-skin-type-bias-artificial-intelligence-systems-0212 [Accessed: 30 August, 2024].

Heilweil, R. (2020) ‘Why algorithms can be racist and sexist’, Vox. Available at: https://www.vox.com/recode/2020/2/18/21121286/algorithms-bias-discrimination-facial-recognition-transparency [Accessed on 21 August 2024].

IABAC (2024) ‘The Future of Data Analytics: AI and Machine Learning Trends’. Available at: https://iabac.org/blog/the-future-of-data-analytics-ai-and-machine-learning-trends [Accessed on 30 August 2024].

IBM (2024) ‘Quantum Computing’. Available at: https://www.ibm.com/topics/quantum-computing [Accessed on 30 August 2024].

Improvado (2024) 10 Big Data Analytics Privacy Problems and How to Navigate Them. Available at: https://improvado.io/blog/big-data-analytics-privacy-problems [Accessed: 24th August 2024].

Institute of Data (2023). Why Big Data is Important: Exploring Its Benefits and Uses | Institute of Data. [online] Institute of Data. Available at: https://www.institutedata.com/sg/blog/why-big-data-is-important/ [Accessed 22 Aug. 2024].

Ikezuruora, C. (2023). Unveiling the Truth: Why Transparency Is Essential in the Age of Big Data. PrivacyEnd. Available at: https://www.privacyend.com/transparency-essential-age-big-data/ (Accessed: 24 August, 2024).

Jorgovan, J. (n.d.) Big Data Analytics: Transforming decision-making in healthcare businesses. Available at: https://jake-jorgovan.com/blog/big-data-analytics-transforming-decision-making-in-healthcare-businesses (Accessed: August 31, 2024).

J.P. Morgan Chase & Co. (n.d.) Payments data for fraud detection. Available at: https://www.jpmorgan.com/technology/artificial-intelligence/initiatives/synthetic-data/payments-data-for-fraud-detection [Accessed: August 31, 2024].

Lohr, S. (2018) Facial Recognition Is Accurate, if You’re a White Guy. The New York Times. Available at: https://www.nytimes.com/2018/02/09/technology/facial-recognition-race-artificial-intelligence.html [Accessed: 30 August, 2024].

Manyika, J., et al. (2011) Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute. Available at: https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/big-data-the-next-frontier-for-innovation [Accessed on 30 August 2024].

Marr, B. (2018) Starbucks: Using big data, analytics and artificial intelligence to boost performance, Forbes. Available at: https://www.forbes.com/sites/bernardmarr/2018/05/28/starbucks-using-big-data-analytics-and-artificial-intelligence-to-boost-performance/ [Accessed: August 31, 2024].

Medium (2023) How Did Netflix Use Big Data to Transform Their Company and Dominate the Streaming Industry? Available at: https://vivekjadhavr.medium.com/how-did-netflix-use-big-data-to-transform-their-company-and-dominate-the-streaming-industry [Accessed: 20th August 2024].

Office of the Privacy Commissioner of Canada (2016) Consent and privacy – A discussion paper exploring potential enhancements to consent under the Personal Information Protection and Electronic Documents Act. Available at: https://www.priv.gc.ca/en/opc-actions-and-decisions/research/explore-privacy-research/2016/consent_201605/ [Accessed: 29th August 2024].

Perez, E. M. A. (2020) How to manage complexity and realize the value of big data. IBM. Available at: https://www.ibm.com/think/insights/how-to-manage-complexity-and-realize-the-value-of-big-data [Accessed: 29th August 2024].

Polonetsky, J., Tene, O. (2013) ‘Privacy and Big Data: Making Ends Meet’, Stanford Law Review, 66(25), 25. Available at: https://heinonline.org/HOL/LandingPage?handle=hein.journals/slro66&div=5&id=&page= [Accessed: 21st August 2024].

Prescott, A. (2023) ‘Bias in big data, Machine Learning and AI: What lessons for the digital humanities?’, Digital Humanities Quarterly, 17 (2). Available at: https://digitalhumanities.org/dhq/vol/17/2/000689/000689.html [Accessed on 21 August 2024].

Raghupathi, W., & Raghupathi, V. (2014) ‘Big Data Analytics in Healthcare: Promise and Potential’, Health Information Science and Systems, 2(3). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4341817/ [Accessed on 30 August 2024].

Reuters (n.d.) “Big Data ethics: Redefining values in the digital world.” Available at: https://legal.thomsonreuters.com/en/insights/articles/big-data-ethics-redefining-values-in-the-digital-world (Accessed: August 24, 2024).

Richardson, S.M., Petter, S. and Carter, M. (2021) ‘Five ethical issues in the Big Data Analytics Age’, AIS Electronic Library, 49 (18). Available at: https://aisel.aisnet.org/cais/vol49/iss1/18/ [Accessed on 22 August 2024].

Salari, O., Haghgoo, S., Salari, A.H., Kovaleva, E.A. (2022) ‘The Role Of Big Data In Business And Decision Making’, Innovative Scientific Research, 4-1(18), 84. Available at: https://www.researchgate.net/publication/361529546_THE_ROLE_OF_BIG_DATA_IN_BUSINESS_AND_DECISION_MAKING [Accessed: 20th August 2024].

Satyanarayanan, M. (2017) ‘The emergence of edge computing’, Computer, 50(1), pp.30-39. doi: https://doi.org/10.1109/MC.2017.9 [Accessed on 19 August 2024].

Solove, D. J. (2013) Big Data Ethics. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2171018 [Accessed: 29th August 2024].

Tang, D., Gupta, S. and Tarabishy, A. (2023) ‘Artificial intelligence in retail: Drivers, applications, and challenges’, Business Horizons, 66(1), pp.87-99. Available at: https://ideas.repec.org/a/eee/bushor/v66y2023i1p87-99.html [Accessed on 30 August 2024].

Keegan, J. and Ng, A. (2021) There’s a Multibillion-Dollar Market for Your Phone’s Location Data. The Markup. Available at: https://themarkup.org/privacy/2021/09/30/theres-a-multibillion-dollar-market-for-your-phones-location-data [Accessed: 21st August 2024].

Day, M., Turner, G. and Drozdiak, N. (2019) Thousands of Amazon Workers Listen to Alexa Users’ Conversations. Time. Available at: https://time.com/5568815/amazon-workers-listen-to-alexa/ [Accessed: 21st August 2024].

Usercentrics (2024) Everything You Need to Know About Consent Management. Available at: https://usercentrics.com/knowledge-hub/consent-management/ [Accessed: 29th August 2024].

Wikipedia (2024) ‘Edge Computing’. Available at: https://en.wikipedia.org/wiki/Edge_computing [Accessed: 30 August 2024].

Williams, M. (2024). Leveraging Big Data Analytics for Strategic Business Growth in 2024. Available at: https://www.linkedin.com/pulse/leveraging-big-data-analytics-strategic-business-growth-mark-williams-mykzf [Accessed 23 Aug. 2024].