
Introducing your newest teammate—generative AI
She recently graduated from college and is excited to apply all she has learned on the job! She’s extremely smart, super creative, and does a great job making connections. And did I tell you how fast she works?
I want to be honest, though. Your new teammate doesn’t have much real-world experience, and she’ll need your help learning the ropes. Make sure you provide lots of specific instructions and carefully review her work to correct for factual, logical, or grammatical errors.
Between your experience and good judgment, and gen AI’s knowledge and speed, I’m confident that, together, the two of you can do amazing things!
Caricature aside, much is being written about the concept of AI as a team member. While earlier technology innovations were seen as tools in our toolkit, AI is personified. For example, AI “learns,” “reasons,” and “hallucinates,” and now it’s “joining our team.” This concept can evoke mixed feelings of fear, skepticism, curiosity, and excitement. What does AI bring to the table? How will we work together? Is this new team member a collaborator or a competitor?
One of the best ways to come to grips with these mixed emotions is to experiment and learn from experience. That lets us cut through the hype and learn specific lessons in the context of a particular use case.
We sat down with Joanne Barnieu, Lead Learning Scientist here at ICF, to learn more about her recent efforts experimenting with gen AI in the context of Learning & Development, and how her experience has informed her perspective on this idea of “AI as a Team Member” and its implications for the future of work.
How have you experimented with AI in your work?
I conduct research projects that incorporate the use of natural language processing. Recently, I was also able to experiment with AI to analyze feedback provided by training participants. Open-ended survey comments often elaborate on what participants liked, didn’t like, and learned from the training, and while this data is useful, it often goes unanalyzed because organizations don’t have the time to perform analyses of feedback from hundreds or thousands of training offerings.
So our team wondered, would AI produce similar results to a human when analyzing these comments? If so, could organizations benefit from insights into this otherwise-untapped data source? We designed an experiment to test these questions by comparing my own manual analysis of the comments with AI-generated results.
What were the outcomes?
First, we needed to see if AI would produce usable themes from the survey feedback. After reviewing the initial themes produced by AI, our honest assessment was that while some were useful, others were too vague. For example, a theme such as “Course Quality” would not sufficiently capture nuances to categorize comments in a meaningful way.
Second, we wanted to see if AI properly associated data with a particular theme. For example, did the AI take a comment like “How to talk to people and have them open up” and align it to a theme like “Communication”? To test this, we provided AI with human-generated themes for feedback on two evaluation questions. We found two key things: The AI was more accurate when there were fewer themes, and the AI was about 60% to 70% accurate compared to human analysis.
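As a rough illustration of how an accuracy figure like this can be computed, the sketch below measures the share of comments where an AI-assigned theme matches the human-assigned one. It is only a minimal sketch: the function name `agreement_rate` and the sample labels are hypothetical and are not the data or the method used in the experiment.

```python
def agreement_rate(human_labels, ai_labels):
    """Share of comments where the AI theme matches the human-assigned theme."""
    if len(human_labels) != len(ai_labels):
        raise ValueError("Both label lists must cover the same comments")
    if not human_labels:
        raise ValueError("At least one labeled comment is required")
    matches = sum(h == a for h, a in zip(human_labels, ai_labels))
    return matches / len(human_labels)


# Hypothetical labels for five comments (illustrative only, not study data).
human = ["Communication", "Course Content", "Instructor", "Communication", "Pacing"]
ai = ["Communication", "Course Content", "Course Content", "Communication", "Instructor"]

rate = agreement_rate(human, ai)
print(f"AI matched the human coder on {rate:.0%} of comments")
```

A simple match rate treats the human coding as the reference. It does not correct for agreement that would happen by chance, so it is best read as a first check and not a full reliability statistic.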
What did you notice? What were some of AI’s strengths and limitations?
Overall, “curiosity” is the best word to classify my experience. As someone who has done many thematic analyses, I was open to seeing how AI could tackle this task. If nothing else, being part of the experiment helped reinforce my understanding of what AI can and can’t yet do.
As I’d suspected based on my own experience and research, AI was able to accomplish the task much faster than a human, but it wasn’t without error. Accuracy can likely be improved over time with better prompting or even fine-tuning of the model, but for now, human review is still needed.
The results made me reflect on a few other factors that might contribute to accuracy in this context, whether the analysis is undertaken by AI or a human reviewer:
The data you receive is only as good as the questions asked. Both humans and AI will struggle to identify common themes for wide-ranging responses to poorly worded or overly broad questions.
Questions on a training course evaluation can lend themselves to multi-part answers. AI may struggle more than a human to consistently tease out the various parts of the answer into different themes. These responses can be improved with more thorough prompting, but that may require an experienced partner’s prompt engineering expertise to fully solve.
Some respondents provided short answers of limited value. For example, when asked what part of the course was most helpful, they might answer “all.” While “all” is favorable in this context, neither humans nor AI can easily classify the comment into a particular theme.
Ultimately, even though the AI didn’t perform perfectly, it provided a great starting point for further refinement. And by combining AI’s analysis with human review, we were able to achieve similar results in a fraction of the time for an otherwise unused data source. That’s a real benefit.
Looking ahead, how do you envision AI transforming your work?
I see AI helping to reduce human effort without entirely replacing a person. Using our experiment as an example, I think AI can help produce and analyze the frequency of usable feedback themes as long as the themes and results are reviewed and edited by a human. Eventually, AI models may be fine-tuned to improve theme prediction, further reducing the level of effort.
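To make the idea of analyzing theme frequency concrete, here is a small sketch that tallies how often each theme appears once a human has reviewed the assignments. The function name `theme_frequencies`, the comments, and the theme names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def theme_frequencies(assignments):
    """Count reviewed theme assignments and return (theme, count, share) rows."""
    counts = Counter(theme for _, theme in assignments)
    total = sum(counts.values())
    # most_common() sorts themes from most to least frequent.
    return [(theme, n, n / total) for theme, n in counts.most_common()]


# Hypothetical (comment, theme) pairs after human review; not study data.
reviewed = [
    ("How to talk to people and have them open up", "Communication"),
    ("Active listening tips", "Communication"),
    ("The instructor explained things clearly", "Instructor"),
    ("Good pace throughout", "Pacing"),
]

for theme, count, share in theme_frequencies(reviewed):
    print(f"{theme}: {count} comments ({share:.0%})")
```

The tally deliberately runs on the reviewed assignments, so the human edits described above are reflected in the final counts.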
Applying AI to something like course evaluations is just the beginning. This new “team member” can apply to any type of qualitative data analysis, which we do so much of in the field of workforce research. And if we can reduce the time spent on analysis, we can get to insights, action, and results much faster.