Data management continues to evolve to address new use cases in advanced analytics, machine learning, and artificial intelligence. The past year has seen the rapid emergence of a new discipline called generative AI, of which the most high-profile example is the intelligent chatbot technology called ChatGPT.
Interest in generative AI is strong. In a recent TDWI survey, fewer than 20% of respondents said they are using generative AI now, but more than 30% plan to adopt it within the next year. Large language models (LLMs), the key type of statistical algorithm behind generative AI, can produce textual outputs such as document summaries, marketing copy, and program code. LLMs can also generate images, video, and other media in response to textual inputs known as “prompts.” Data scientists, business analysts, and others can use LLMs to generate SQL and Python code; sample and synthetic data sets; and visualizations, metrics, summary statistics, and explanations of complex relationships within data sets.
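To make the text-to-SQL use case above concrete, here is a minimal sketch of how a prompt for LLM-driven SQL generation might be assembled. The function name, prompt wording, and table schema are illustrative assumptions, not part of the panel description or any specific vendor API; in practice the prompt would be sent to an LLM completion endpoint.

```python
# Illustrative sketch (assumed names): build a text-to-SQL prompt that pairs
# a database schema with a natural-language question for a generic LLM.

def build_sql_prompt(question: str, schema: str) -> str:
    """Assemble a text-to-SQL prompt for an LLM completion API."""
    return (
        "You are a SQL assistant. Given this schema:\n"
        f"{schema}\n"
        f"Write one SQL query that answers: {question}\n"
        "Return only the SQL."
    )

# Hypothetical example schema and question
schema = "orders(order_id INT, customer_id INT, total DECIMAL, ordered_at DATE)"
prompt = build_sql_prompt("What was total revenue last month?", schema)
print(prompt)
```

The returned string would then be submitted as the prompt, and the model's response treated as a draft query to be reviewed before execution.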
Data management professionals know that they must continue to evolve their skills, teams, platforms, and practices to address new opportunities associated with generative AI and other technological innovations. At the same time, enterprises must implement the necessary guardrails to mitigate growing risks to privacy, ethics, security, intellectual property protection, and other areas of concern. In this sponsor panel, TDWI senior research director James Kobielus will lead data industry experts in a discussion of these trends, issues, and opportunities. They will explore such issues as:
- When and how should enterprises evaluate the potential of LLM-driven query generation tools as an adjunct to or replacement for existing SQL optimization tools?
- How should enterprises incorporate LLM-driven tools such as synthetic data generation and prompt-driven ETL code generation into their data engineering practices?
- Are vector databases the future of generative AI, or is there a future for SQL, NoSQL, and other DBMSs in hybrid multiplatform deployments?
- How should organizations evolve their data lakes to speed the building, training, operationalization, and governance of high-impact generative AI applications?
- What roles, skills, and workflows should organizations adopt in their data governance practices to effectively curate training data and iteratively fine-tune model outputs through prompt engineering?
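On the vector database question above, the core operation such systems perform is nearest-neighbor search over embeddings. The sketch below, using toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions), shows the cosine-similarity ranking at the heart of that lookup; the document names and values are invented for illustration.

```python
# Minimal sketch of the similarity lookup a vector database performs:
# rank stored embedding vectors by cosine similarity to a query vector.
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "index" of document embeddings (names and values are hypothetical)
store = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.2],
    "doc_c": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]

# Return the stored document whose embedding best matches the query
best = max(store, key=lambda k: cosine(store[k], query))
print(best)  # doc_a: its vector is most aligned with the query
```

Production systems replace this linear scan with approximate nearest-neighbor indexes (e.g., HNSW) to search millions of vectors efficiently, but the similarity metric is the same idea.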