Evaluating the Effectiveness of Prompt Engineering in Salesforce Prompt Studio

Authors

  • Shalini Polamarasetti, Independent Researcher

DOI:

https://doi.org/10.63282/3050-9246.IJETCSIT-V2I3P111

Keywords:

Prompt Engineering, Salesforce Prompt Studio, Generative AI, Large Language Models (LLMs), AI Optimization, Natural Language Processing (NLP), Prompt Design Strategies, AI Productivity Tools, Context-Aware Responses, Workflow Automation

Abstract

The emergence of Large Language Models (LLMs) has transformed the enterprise application landscape, enabling natural language to drive process automation, customer communication, and data analysis. In this ecosystem, prompt engineering has become one of the most important determinants of output quality, particularly on platforms that turn user-written instructions into generated text, such as Salesforce Prompt Studio. This paper examines how different prompt design approaches, namely few-shot prompting, template-based inputs, and contextual priming, affect the performance and reliability of AI output in Salesforce Prompt Studio. The study considers several use cases, including customer service automation, lead scoring, and email generation. By creating and comparing prompt variants against metrics such as factual accuracy, relevance, tone alignment, and user satisfaction, the paper demonstrates a measurable relationship between prompt design and output quality. The results identify best practices in prompt development and offer recommendations for prompt engineering at enterprise scale. The study contributes to the emerging field of applied LLMs by grounding prompt optimization in realistic business problems, with evidence of improved productivity and consistency in CRM settings.
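As a concrete illustration of the three prompt-design strategies the abstract compares, the sketch below constructs a template-based, a few-shot, and a contextually primed variant of the same email-generation prompt. This is a minimal, platform-agnostic Python sketch: the record fields and example texts are hypothetical stand-ins for what a Prompt Studio template might resolve from CRM data, and no Salesforce API is invoked.

```python
# Minimal sketch of three prompt-design strategies applied to the
# email-generation use case. All field names and example texts are
# hypothetical; no Salesforce API is called.

case = {
    "contact_name": "Jordan Lee",          # hypothetical CRM record fields
    "product": "Cloud Analytics Suite",
    "issue": "delayed onboarding",
}

# 1) Template-based input: a fixed instruction with merge fields filled
#    in, analogous to how a prompt template resolves record data.
template_prompt = (
    "Write a concise, professional follow-up email to {contact_name} "
    "about {product}, acknowledging the {issue} and proposing next steps."
).format(**case)

# 2) Few-shot prompting: prepend worked examples so the model can infer
#    the expected tone and structure before seeing the new case.
few_shot_prompt = (
    "Example input: customer reports a billing error on Service Cloud.\n"
    "Example output: Hi Sam, thank you for flagging the billing error...\n\n"
    "Example input: customer asks about renewal terms for Sales Cloud.\n"
    "Example output: Hi Priya, happy to clarify the renewal terms...\n\n"
    f"Input: {case['contact_name']} reports {case['issue']} "
    f"with {case['product']}.\nOutput:"
)

# 3) Contextual priming: lead with a role and grounding context so the
#    response stays aligned with brand tone and the customer's history.
primed_prompt = (
    "You are a support specialist for a CRM vendor. The customer has "
    "been on the platform for two years and prefers brief, "
    "action-oriented communication.\n\n" + template_prompt
)

for name, prompt in [("template", template_prompt),
                     ("few-shot", few_shot_prompt),
                     ("primed", primed_prompt)]:
    print(f"--- {name} ---\n{prompt}\n")
```

In an evaluation like the one the abstract describes, each variant would be submitted to the same model and the resulting outputs scored on factual accuracy, relevance, tone alignment, and user satisfaction.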

Published

2021-10-30

Issue

Vol. 2 No. 3 (2021)

Section

Articles

How to Cite

Polamarasetti S. Evaluating the Effectiveness of Prompt Engineering in Salesforce Prompt Studio. IJETCSIT [Internet]. 2021 Oct. 30 [cited 2025 Nov. 14];2(3):96-103. Available from: https://www.ijetcsit.org/index.php/ijetcsit/article/view/474
