Module 1: From Writer to Director
Let AI Handle the Heavy Lifting
Learn to brainstorm ideas, draft documents, and refine content using Gemini and Canvas - with built-in fact-checking to keep you accurate.
Learning Objectives
- Use Gemini to brainstorm ideas and overcome the blank page problem
- Refine documents in the Canvas split-screen workspace
- Write effective prompts by specifying role and audience
- Verify AI-generated content using the Double-check feature
- Identify and handle AI hallucinations responsibly
What You'll Learn
- Why AI changes the way we write and work
- Gemini as a brainstorming partner
- Writing effective prompts with role and audience
- The Canvas split-screen workspace
- Inline editing: shorten, expand, and restyle text
- AI hallucinations and why they happen
- The Double-check button and source verification
- Keeping the human in the loop
Why AI Changes the Way We Write
The world of business communication is going through its biggest shift in decades. In the past, computers only did exactly what we told them to do with specific instructions. A word processor was essentially a digital typewriter: it helped you format text but offered zero creative input. Today, we have entered an era of generative intelligence. Tools like Gemini do not just store or retrieve information. They create new things - drafts, outlines, summaries, and even full reports - based on patterns learned from vast amounts of text.
For a professional, this shift changes your role fundamentally. Instead of spending hours on the manual labour of drafting emails, writing proposals, or searching through browser tabs for the right wording, you become a director. You provide the vision, the context, and the judgment. The AI handles roughly 80 percent of the heavy lifting. Your job is to guide the output and apply the final 20 percent - the human touch that no machine can replicate.
This does not mean AI replaces writers. It means AI removes the drudgery so you can focus on strategy, persuasion, and creativity. A marketing manager who once spent two hours drafting a campaign brief can now produce a solid first draft in ten minutes and spend the remaining time refining the message and tailoring it for the audience. The total quality goes up because the human spends more time on the parts that matter most.
Watch video: Why AI Changes the Way We Write
Key Insight: AI does not replace writers. It removes the drudgery so you can focus on the parts that matter most - strategy, persuasion, and the human touch.
Real-World Example: A marketing manager asks Gemini to draft five campaign brief ideas for a new product launch. Instead of starting from scratch, she reviews the suggestions, picks the strongest angle, and refines it - finishing in 15 minutes instead of two hours.
Think of a writing task you do regularly at work. How much of that task is repetitive drafting, and how much is genuine strategic thinking? How could shifting that ratio change the quality of your output?
Brainstorming with Gemini
We all know the frustration of staring at a blank screen, cursor blinking, with no idea where to begin. Gemini solves this by acting as a brainstorming partner you can talk to in plain language. You can ask it for five ideas for a project, a rough outline for a presentation, or a first draft of a memo. The AI responds instantly with structured suggestions you can build on.
The secret to getting good results from Gemini is being specific in your prompts. A vague request like "write something about marketing" produces generic output. A specific prompt produces dramatically better results. The technique is simple: tell the AI who it should be (its role), who the audience is, and what you need.
For example, instead of saying "write a marketing email," you could say: "You are a senior marketing consultant. Write a 200-word email to small business owners explaining why they should invest in social media advertising. Use a friendly but professional tone." This kind of prompt gives Gemini the context it needs to produce relevant, well-targeted output on the first try.
You can also iterate. If the first response is not quite right, tell Gemini what to change: "Make it shorter," "Add a call to action," or "Use simpler language." Each round gets you closer to a polished result without starting over.
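The role-audience-task pattern can be captured in a small helper. This is an illustrative sketch only - the build_prompt function and its parameters are hypothetical conventions, not part of any Gemini API:

```python
def build_prompt(role, audience, task, tone=None, length=None):
    """Assemble a prompt from role, audience, and task, with
    optional tone and length constraints."""
    parts = [f"You are {role}.", task, f"The audience is {audience}."]
    if length:
        parts.append(f"Keep it to roughly {length}.")
    if tone:
        parts.append(f"Use a {tone} tone.")
    return " ".join(parts)

prompt = build_prompt(
    role="a senior marketing consultant",
    audience="small business owners",
    task="Write an email explaining why they should invest in social media advertising.",
    tone="friendly but professional",
    length="200 words",
)
print(prompt)
```

The same assembled string works whether you paste it into the Gemini chat box or send it to a model programmatically - the point is that every prompt carries all five ingredients.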
Key Insight: The key to great AI output is a specific prompt: define the role, the audience, and the task. Vague prompts produce vague results.
Real-World Example: Prompt: "You are an HR director. Write a 150-word announcement to all employees about a new flexible work policy starting next month. Keep the tone positive and encouraging." - This prompt includes role, audience, task, length, and tone.
Action step: Take the next email or document you need to write and spend 60 seconds crafting a proper prompt with role, audience, and task before asking Gemini. Compare the result to what you would have written yourself.
The Canvas Split-Screen Workspace
One of the most useful features for document work is Canvas. It is a split-screen workspace built into Gemini. On one side, you have your document. On the other side, you have the AI assistant ready to help. This design eliminates the frustrating cycle of switching between a writing app and a chatbot in separate windows.
The power of Canvas is in its inline editing capabilities. You can highlight any paragraph in your document and give the AI a specific instruction: "Make this more professional," "Shorten this to two sentences," "Translate this to formal English," or "Explain this in simpler terms." The AI rewrites just that selection while leaving the rest of your document untouched.
This is fundamentally different from copying text into a separate chat window, getting a response, and pasting it back. Canvas keeps everything in one place and preserves your document structure. You can make dozens of small refinements without losing your place or accidentally overwriting something.
Canvas also supports different content types. You can use it for emails, reports, blog posts, proposals, and more. The AI understands context from the surrounding text, so when you ask it to "expand this paragraph," it writes new content that flows naturally with what comes before and after.
For teams, Canvas is especially valuable because it creates a consistent workflow. Instead of each person using a different approach to AI-assisted writing, everyone can use the same side-by-side environment. This makes it easier to train new team members on AI-assisted writing and maintain quality standards across the organisation.
Watch video: The Canvas Split-Screen Workspace
Key Insight: Canvas keeps your document and AI assistant side by side. Highlight any text and give specific instructions - the AI edits just that selection without touching the rest.
Do you think a side-by-side writing workspace like Canvas would change how often you use AI for document work? What has been your biggest frustration with switching between tools when writing?
AI Hallucinations and the Double-Check Button
AI is powerful, but it has a significant weakness: it can sometimes hallucinate. In AI terminology, a hallucination is when the model generates information that sounds completely plausible and confident but is actually false. It might cite a statistic that does not exist, reference a study that was never published, or state a "fact" that contradicts reality.
Hallucinations happen because AI models like Gemini do not "know" facts the way humans do. They predict the most likely next word based on patterns in their training data. Sometimes those patterns produce text that is statistically probable but factually wrong. The AI has no internal sense of truth - it cannot distinguish between generating a correct fact and generating a convincing-sounding fiction.
This is why you must always be the final editor. Never publish or share AI-generated content without reviewing it first. For critical documents like financial reports, legal contracts, or medical information, human verification is not optional - it is essential.
Gemini includes a built-in safeguard called the Double-check button. When you click it, Gemini uses Google Search to verify its own claims against real web sources. It then highlights the text using a colour code: green means it found supporting evidence online, and orange means it could not find a source to back up the claim. This does not guarantee accuracy, but it gives you a quick visual map of which parts of the response are well-supported and which need your attention.
The Double-check feature is particularly useful for research summaries, data-heavy content, and any text where accuracy is critical. It turns fact-checking from a time-consuming manual process into a one-click starting point. You should still verify orange-highlighted claims manually, but the tool saves significant time by showing you exactly where to focus your attention.
Watch video: AI Hallucinations and the Double-Check Button
Key Insight: The Double-check button uses Google Search to verify AI claims. Green text means evidence was found online. Orange text means no supporting source was found - verify these claims manually.
Real-World Example: You ask Gemini to summarise market trends in renewable energy. The response mentions a specific percentage growth figure. You click Double-check: the growth figure turns green (verified), but a claim about a government policy turns orange. You search manually and find the policy detail was slightly inaccurate, so you correct it before sharing.
Have you ever shared or acted on information that turned out to be inaccurate? In your field, what are the highest-stakes situations where an AI hallucination could cause real damage?
Keeping the Human in the Loop
The concept of keeping the human in the loop is central to responsible AI use. It means that no matter how good the AI output looks, a human must review, edit, and approve the final result before it reaches its audience. This is not just about catching errors - it is about maintaining quality, authenticity, and accountability.
There are three practical reasons why the human must always remain in control. First, accuracy: as we discussed, AI can hallucinate. A human reviewer catches factual errors that the AI misses. Second, tone and context: AI does not understand your company culture, your relationship with the recipient, or the political dynamics of a situation. A perfectly grammatical email can still be tone-deaf if it does not account for context that only a human would know. Third, accountability: when something goes wrong with a communication, the sender is responsible - not the AI. You cannot tell a client, "Sorry, the AI wrote that." The human must own the output.
A practical workflow for AI-assisted writing follows a simple pattern. Start by giving the AI a clear, specific prompt with role, audience, and task. Review the output for factual accuracy. Edit for tone, context, and your personal style. Use the Double-check button for any factual claims. Then do a final read as if you were the recipient. This five-step process keeps the AI as a powerful assistant while ensuring the human remains the decision-maker.
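That five-step pattern can be written down as a simple checklist. The step names and the review helper below are illustrative sketches, not any real tool's API:

```python
# The five-step AI-assisted writing workflow as a checklist.
WORKFLOW = [
    "Prompt: give the AI a specific role, audience, and task",
    "Review: check the output for factual accuracy",
    "Edit: adjust tone, context, and personal style",
    "Double-check: verify factual claims against sources",
    "Final read: reread as if you were the recipient",
]

def review(draft, completed_steps):
    """A draft is ready only when a human has completed every step -
    the AI never signs off on its own output."""
    remaining = [s for s in WORKFLOW if s not in completed_steps]
    return {"draft": draft, "ready": not remaining, "remaining": remaining}

status = review("Quarterly update email", WORKFLOW[:3])
print(status["ready"], len(status["remaining"]))  # prints: False 2
```

Encoding the workflow as an explicit list makes the point concrete: the human, not the model, decides when "ready" becomes true.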
The goal is not to avoid using AI out of fear. The goal is to use it confidently, knowing that your judgment shapes the final product. Professionals who master this balance become dramatically more productive without sacrificing quality or trust.
Key Insight: Three reasons to keep the human in the loop: accuracy (catch hallucinations), tone and context (understand the real situation), and accountability (you own the output, not the AI).
Of the three reasons to keep the human in the loop, which one feels most relevant to your own work right now - accuracy, tone and context, or accountability? Why?
Module 2: Generate Creative Media
Images, Video, and Music on Demand
Create professional images, short videos, and background music using Nano Banana, Veo, and Lyria - no design skills required.
Learning Objectives
- Generate high-quality images with style control using Nano Banana 2
- Create short video clips from text descriptions using Veo 3.1
- Produce background music and audio tracks using Lyria 3
- Apply style consistency techniques for branding and storyboarding
- Understand SynthID watermarking and responsible AI media use
What You'll Learn
- From expensive designers to instant AI media creation
- Nano Banana 2: image generation with style control
- Subject consistency and text rendering for business use
- Veo 3.1: cinematic video generation with realistic physics
- Lyria 3: music and audio generation from text descriptions
- SynthID watermarking for responsible AI media
- Practical applications for presentations and marketing
- Resolution options from social media to 4K production
Instant Images with Nano Banana 2
High-quality visuals used to require expensive designers or hours of learning complex software. Nano Banana 2 changes this by letting anyone create professional images just by describing what they want in plain language. You type a description, and the tool creates a high-quality picture in seconds.
What makes Nano Banana 2 particularly powerful for business is its style control. You can specify exactly the visual style you want - "clean corporate style," "watercolour look," "flat illustration," or even "Synthetic Cubism" - and the tool will maintain that exact aesthetic across multiple images. This consistency is crucial for branding, where every visual needs to feel like part of the same family.
Nano Banana 2 combines advanced world knowledge with studio-quality creative control, built on the speed of Gemini Flash. It excels at subject consistency, capable of preserving the exact appearance of up to five characters and maintaining the fidelity of 14 different objects in a single scene without getting confused. This makes it ideal for storyboarding, where you need the same characters appearing across multiple frames.
The tool also features precise text rendering and translation for marketing mockups. If you need an image of a billboard with specific text on it, Nano Banana 2 can render that text clearly and legibly - something earlier image generators handled badly. Resolution options range from 512 pixels for quick social media posts up to full 4K resolution for production-ready assets.
For everyday business use, this means you can create presentation graphics, social media visuals, product mockups, and marketing materials without waiting for a designer or learning Photoshop. A single prompt produces a usable image that you can immediately drop into a slide deck or social post.
Watch video: Instant Images with Nano Banana 2
Key Insight: Nano Banana 2 offers style control, subject consistency for up to 5 characters and 14 objects, precise text rendering, and resolutions from 512px to 4K - all from a text description.
Real-World Example: A trainer creating a slide deck types: "A clean corporate illustration of a diverse team collaborating around a digital whiteboard, flat design style, blue colour palette." Nano Banana 2 generates a polished image matching their brand guidelines in seconds.
Action step: Think of one presentation or social media post you are working on this week. Write a specific image prompt with style, mood, and subject details, then try it in Nano Banana 2 to see how well it matches your vision.
Cinematic Video with Veo 3.1
Video content is increasingly essential for business communication, but professional video production has traditionally required cameras, lighting, editing software, and significant time investment. Veo 3.1 changes this equation by letting you create short video clips simply by describing the scene you want.
Veo 3.1 is engineered to meet real-world production demands. It generates stunning 4K output and supports configurable aspect ratios, including 16:9 landscape for presentations and YouTube, and 9:16 portrait for mobile and social media stories. The clips are typically around 8 seconds long - perfect for social media content, presentation transitions, or product demos.
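The arithmetic behind those aspect ratios is worth a glance. Below is a minimal sketch, assuming "4K" means a 3840-pixel long edge; the helper is illustrative, not a Veo API call:

```python
def frame_size(aspect, long_edge=3840):
    """Pixel dimensions for an aspect ratio, scaling the long edge
    to 4K's 3840 pixels."""
    w, h = (int(n) for n in aspect.split(":"))
    if w >= h:  # landscape: width is the long edge
        return long_edge, long_edge * h // w
    return long_edge * w // h, long_edge  # portrait: height is the long edge

print(frame_size("16:9"))  # (3840, 2160) - presentations, YouTube
print(frame_size("9:16"))  # (2160, 3840) - mobile and social stories
```

Knowing the target dimensions up front helps you pick the right ratio before generating, rather than cropping a landscape clip into a portrait frame afterwards.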
What sets Veo 3.1 apart is its understanding of real-world physics. The tool accurately simulates light, shadow, and physical movement, making the generated videos look remarkably realistic rather than obviously artificial. If you describe a scene with water, the reflections behave naturally. If you describe someone walking, the motion looks fluid and natural.
Veo 3.1 also delivers rich native audio directly in the generated clips. This means the video comes with appropriate ambient sounds built in, saving you the additional step of sourcing and syncing audio separately. For a professional creating quick content, this is a significant time-saver.
Watch video: Cinematic Video with Veo 3.1
Key Insight: Veo 3.1 generates 4K video clips with configurable aspect ratios (16:9 and 9:16), realistic physics simulation, and built-in native audio - all from a text description.
Real-World Example: A consultant needs a short video for a LinkedIn post showing a sunrise over a modern city skyline. She types the description into Veo 3.1 and gets an 8-second clip with realistic lighting, natural cloud movement, and ambient city sounds - ready to upload immediately.
In your opinion, does being able to generate 4K video from a text description change what kinds of content a small team or solo professional can realistically create? What opportunities does that open up for you?
Music and Audio with Lyria 3
Music is often the missing piece that turns a good presentation or video into a great one. Background music sets the mood, maintains attention, and makes content feel polished and professional. But licensing music or hiring a composer has traditionally been expensive and time-consuming. Lyria 3 solves this by generating professional-grade audio from text descriptions.
Using Lyria 3 is as simple as describing the mood you want. You can say "uplifting and fast-paced" for an energetic product launch video, "calm and reflective" for a corporate wellness presentation, or "an upbeat birthday tune" for a celebration video. Lyria generates tracks that match your description, typically producing 30-second background tracks perfect for presentations, social media, or short videos.
Lyria 3 handles high-fidelity audio and music generation with impressive versatility. You can specify vocal styles, acoustic preferences, and instrumental choices. Want a track with soft piano and ambient strings? Just describe it. Need an electronic beat with a driving bassline? Describe that instead. The AI understands musical concepts and translates your plain-language description into a coherent, professional-sounding track.
One particularly creative feature is the ability to upload an image and ask Lyria to transform it into a custom track. A photo of a beach sunset might produce a calm, wave-like ambient piece. A photo of a bustling market might produce an upbeat, rhythmic track. This image-to-music capability opens up creative possibilities that were previously impossible without musical training.
For business professionals, Lyria 3 means you can add custom audio to presentations, training videos, social media content, and internal communications without any music licensing concerns or production costs.
Watch video: Music and Audio with Lyria 3
Key Insight: Lyria 3 generates 30-second professional audio tracks from text descriptions. You can specify mood, vocal styles, instruments, and even transform an uploaded image into a custom music track.
How could custom background music change the feel of your next presentation, training session, or video? Describe the mood you would want in your own words.
SynthID: Responsible AI Media
As AI-generated media becomes increasingly realistic, a critical question arises: how do you tell the difference between content created by a human and content created by AI? This matters for trust, intellectual honesty, and preventing misinformation. Google addresses this challenge with SynthID watermarking technology.
SynthID works by embedding an imperceptible digital watermark into AI-generated content. This watermark is invisible to the human eye in images and inaudible in audio, but it can be detected by specialised tools. It marks the content so that anyone can verify whether media was created or edited using Google AI tools.
The watermark is designed to be robust - it survives common modifications like cropping, resizing, compressing, and adding filters. This means that even if someone downloads an AI-generated image and modifies it before sharing, the SynthID watermark remains detectable. This durability is important for accountability in professional and media contexts.
For business professionals, SynthID provides several practical benefits. First, it helps maintain transparency with clients and audiences. If you use AI-generated visuals in a presentation, the watermark serves as an honest signal about how the content was produced. Second, it provides protection against misuse. If someone takes your AI-generated content and uses it in a misleading context, the watermark can help trace its origin. Third, it supports emerging regulatory requirements around AI content disclosure that many jurisdictions are developing.
Google embeds SynthID across its creative tools, including Nano Banana 2, Veo 3.1, and Lyria 3. This means all content generated through these tools is automatically watermarked without any extra steps from the user. The watermarking happens in the background, adding no friction to the creative process.
Key Insight: SynthID embeds invisible, robust watermarks in all AI-generated media - images, video, and audio - enabling verification of AI origin even after cropping, resizing, or compression.
Real-World Example: A marketing team generates product images using Nano Banana 2. When a client asks whether the images are AI-generated, the team can verify this using SynthID detection tools, maintaining transparency and trust in the business relationship.
Do you think professionals have an obligation to disclose when they use AI-generated images or videos in client work? Where would you draw the line between helpful and misleading?
Practical Applications for Business
Understanding the individual tools is important, but the real power emerges when you combine them for practical business tasks. Each tool addresses a different media need, and together they cover the full spectrum of content creation that professionals encounter daily.
For presentations, you can use Nano Banana 2 to generate custom illustrations that match your brand style, Veo 3.1 to create short video transitions or product demo clips, and Lyria 3 to add professional background music. A presentation that would have required a designer, a videographer, and a music licence can now be produced by one person in a fraction of the time.
For social media marketing, the combination is equally powerful. Create eye-catching images for Instagram, short video clips for TikTok or LinkedIn, and audio tracks for podcast intros or YouTube videos. The style control in Nano Banana 2 ensures all your visual content maintains brand consistency, while the portrait mode in Veo 3.1 produces mobile-optimised video content.
For training and education, these tools transform how instructional content is created. Instead of searching stock photo libraries for generic images, you can generate exactly the visual you need to illustrate a specific concept. Instead of recording and editing video manually, you can create short explanatory clips that visualise abstract processes or scenarios.
The key to effective use is starting with a clear brief for each piece of content. Just as a specific prompt produces better writing from Gemini, a specific description produces better media from Nano Banana, Veo, and Lyria. Include details about style, mood, composition, and intended use to get the best results on the first try.
Key Insight: Combining Nano Banana 2 (images), Veo 3.1 (video), and Lyria 3 (audio) lets one person create a complete multimedia package for presentations, social media, or training content.
Action step: Identify one upcoming piece of content - a presentation, social post, or training material - and plan which combination of Nano Banana 2, Veo 3.1, and Lyria 3 you could use to make it more engaging.
Module 3: Simplify Information Research
From Data Overload to Clear Insights
Master Deep Research for web exploration, NotebookLM for private document analysis, and Audio Overviews for learning on the go.
Learning Objectives
- Conduct multi-step web research using Deep Research as a personal search agent
- Analyse private documents with high accuracy using NotebookLM
- Understand source grounding and why it reduces AI hallucinations
- Transform documents and research into podcast-style Audio Overviews
- Choose the right research tool for different information needs
What You'll Learn
- The challenge of information overload in modern work
- Deep Research: automated multi-step web exploration
- Monitoring the AI thinking process in real time
- NotebookLM: source-grounded analysis of private data
- How grounding reduces hallucinations
- Audio Overviews: podcast-style learning from documents
- Choosing the right tool: Deep Research vs NotebookLM vs Guided Learning
- Practical research workflows for professionals
Deep Research: Your Personal Search Agent
Traditional web searching gives you a list of blue links. You click them one by one, read through each page, take notes, compare information across tabs, and eventually piece together an answer. For a simple question this works fine, but for complex research tasks - like evaluating a competitor, understanding a new regulation, or exploring a market trend - this manual process takes hours.
Deep Research works fundamentally differently. It is an autonomous "agent" that takes your question, creates a research plan, searches dozens of websites on your behalf, reads and synthesises the information, and then writes a comprehensive report with citations linking back to the original sources. Instead of you doing the browsing, the AI does it for you.
What makes Deep Research particularly powerful is its transparency. While the model is working, you can watch its "thinking" steps in real time. You see which websites it is visiting, what questions it is exploring, and how it is building its understanding. This is not a black box - you can follow the AI's research process just as you would follow a human research assistant's progress.
Once the report is generated, the conversation does not end. You can ask follow-up questions to add new information, explore a specific angle in more depth, or ask the AI to compare two findings. The report updates in real time, making Deep Research a living document rather than a static output.
Deep Research is ideal for tasks that require broad understanding across multiple sources: competitive analysis, industry trend reports, regulatory overviews, market entry research, or finding the best options for a specific need like summer camps or software tools.
Watch video: Deep Research: Your Personal Search Agent
Key Insight: Deep Research acts as your personal search agent: it creates a plan, browses dozens of websites, and writes a comprehensive report with citations - saving hours of manual browsing.
Real-World Example: A consultant needs to understand how three competitors position their AI training services. Instead of spending half a day visiting each competitor's website, she asks Deep Research to compare their offerings, pricing models, and target audiences. In minutes, she has a structured report with source links.
What research task have you recently done manually that took you more than an hour? How do you think Deep Research would have changed that experience in terms of time and quality?
NotebookLM: Your Smart Notebook for Private Data
NotebookLM is a tool designed for a fundamentally different research task: analysing your own private documents rather than searching the open web. You upload your files - PDFs, Google Docs, YouTube transcripts, or other text sources - and NotebookLM becomes an AI assistant that only looks at the material you have provided.
This design principle is called source grounding. Because the AI is constrained to your specific data and cannot draw on its general training knowledge, it is much less likely to hallucinate or make things up. When NotebookLM answers a question, it points to the exact passage in your uploaded documents where it found the information. This traceability is crucial for professional work where accuracy matters.
A single NotebookLM notebook can handle approximately 25 million words of source material. This enormous capacity means you can upload hundreds of pages of project notes, research papers, policy documents, or meeting transcripts and have the AI extract the key points, find connections, and answer questions across all of them simultaneously.
Practical applications are wide-ranging. You can upload a 200-page annual report and ask NotebookLM to summarise the key financial trends. You can upload all the meeting notes from a quarter and ask it to identify recurring themes or unresolved action items. You can upload training manuals and ask it to create study guides or FAQs. In every case, the AI works exclusively from your data, giving you high-confidence answers grounded in your actual documents.
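To get a rough sense of how far 25 million words goes, a back-of-the-envelope estimate helps. This sketch assumes roughly 500 words per page, a common but approximate figure; the helper is illustrative, not a NotebookLM API:

```python
CAPACITY_WORDS = 25_000_000  # approximate per-notebook limit from the text above
WORDS_PER_PAGE = 500         # rough estimate for a dense printed page

def fits_in_notebook(page_counts):
    """Estimate whether a set of documents, given as page counts,
    fits within a single notebook's word capacity."""
    total_words = sum(page_counts) * WORDS_PER_PAGE
    return total_words, total_words <= CAPACITY_WORDS

# e.g. a 200-page annual report, a 150-page handbook, 30 pages of updates
total, fits = fits_in_notebook([200, 150, 30])
print(total, fits)  # prints: 190000 True
```

Even this sizeable document set uses well under one percent of the capacity, which is why a single notebook can absorb a whole quarter's worth of material.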
Watch video: NotebookLM: Your Smart Notebook for Private Data
Key Insight: NotebookLM is source-grounded: it only analyses the documents you upload, dramatically reducing hallucinations. A single notebook handles up to 25 million words of source material.
Real-World Example: An HR manager uploads the company's 150-page employee handbook, 30 pages of recent policy updates, and last quarter's meeting notes. She asks NotebookLM: "What are the key changes to our leave policy this year?" The AI gives a precise answer citing the exact pages and documents.
Action step: Think of a large document or set of reports that sits unread on your desk or drive. What three questions would you immediately ask NotebookLM if you could upload that material right now?
Audio Overviews: Learning on the Go
One of the most innovative features shared by both NotebookLM and Deep Research is the ability to transform documents and research into Audio Overviews. With a single click, these tools generate an engaging, podcast-style audio discussion where two AI hosts have a dynamic back-and-forth conversation about your material.
The AI hosts do not simply read the text aloud. They summarise the material, draw connections between topics, highlight important points, and provide unique perspectives. The conversation format makes complex information more digestible because you hear ideas being discussed and explained rather than just listed. It feels like listening to two knowledgeable colleagues discussing the material over coffee.
Audio Overviews are generated in your selected output language and can be downloaded or shared. This makes them an excellent tool for learning on the go. You can generate an Audio Overview of a lengthy research report and listen to it during your commute. You can create one from a training manual and absorb the key points while exercising. The ability to multitask while learning is a significant productivity boost.
For teams, Audio Overviews serve as a powerful way to distribute knowledge. Instead of asking every team member to read a 50-page report, a manager can generate a 10-minute audio summary and share it. Team members get the key insights without the reading burden, and they can always refer back to the original document for details.
The audio format can also help with retention. Many people find it easier to remember information they have heard discussed in a conversational format than information they have read silently. By transforming dense written material into an engaging dialogue, Audio Overviews make it more likely that the information actually sticks.
Watch video: Audio Overviews: Learning on the Go
Key Insight: Audio Overviews transform documents into podcast-style conversations with two AI hosts who summarise, connect ideas, and explain key points - available for download and sharing.
Real-World Example: A team lead receives a 40-page industry analysis report. Instead of scheduling a meeting to discuss it, she generates an Audio Overview and shares the link with her team. Everyone listens during their commute and arrives at the next meeting already informed on the key findings.
When during your week do you have time that is currently "wasted" but could become learning time if you had good audio content? How could Audio Overviews fit into that gap?
Choosing the Right Research Tool
With multiple research tools available, knowing which one to use for a given task makes the difference between efficient research and wasted time. The three main tools each serve a distinct purpose, and understanding their strengths helps you choose correctly every time.
Deep Research is your choice when you need to explore the open web broadly. Use it for competitive analysis, industry trends, market research, regulatory overviews, or any question where the answer is scattered across many public websites. Deep Research excels at synthesis - pulling together information from dozens of sources into one coherent report. Its "thinking" transparency lets you guide the research process and ask follow-up questions.
NotebookLM is your choice when you need to analyse your own private documents. Use it when you have specific files - reports, manuals, meeting notes, contracts, research papers - and need to extract insights, answer questions, or create summaries from that material. NotebookLM's source grounding gives you high confidence that the answers come directly from your data, with citations pointing to exact passages.
Guided Learning is your choice when you want to understand a topic deeply rather than just gather information. It acts like a tutor, asking you questions to test your understanding and explaining concepts until they click. Use it when you are learning something new and want to ensure you truly understand the material rather than just reading through it.
All three tools support Audio Overviews, so regardless of which research path you choose, you can transform the output into a podcast-style discussion for review on the go. The key decision is whether your information need is about broad web exploration (Deep Research), private document analysis (NotebookLM), or personal learning and comprehension (Guided Learning).
Key Insight: Use Deep Research for broad web exploration, NotebookLM for private document analysis, and Guided Learning for interactive tutoring - all three support Audio Overviews.
Think of a real research task you faced in the last month. Which tool - Deep Research, NotebookLM, or Guided Learning - would have been the right fit, and why?
Practical Research Workflows
Knowing the tools is important, but building effective research workflows is what transforms occasional tool use into consistent productivity gains. Here are practical workflows that professionals can adopt immediately.
For competitive intelligence, start with Deep Research to get a broad overview of your competitors' positioning, pricing, and messaging. Watch the thinking steps to ensure the AI is exploring the right angles. Once you have the report, ask follow-up questions to dig deeper into specific areas. Then create an Audio Overview to share the key findings with your team.
For policy or compliance review, upload all relevant documents to NotebookLM - regulations, company policies, guidelines, and previous audit reports. Ask the AI to identify gaps between your current practices and regulatory requirements. Because NotebookLM is source-grounded, it will cite exactly which regulation requires what and which company document addresses (or fails to address) it.
For preparing presentations, combine both tools. Use Deep Research to gather external data - market trends, statistics, industry benchmarks. Use NotebookLM to analyse your internal data - sales figures, customer feedback, project outcomes. Together, they give you a complete picture combining external context with internal reality.
For onboarding new team members, upload all training materials, process documents, and team guidelines to NotebookLM. New hires can ask questions about company processes and get accurate, grounded answers with citations to the relevant documents. Generate Audio Overviews of the most important materials so new team members can absorb essential information during their first week.
The common thread across all these workflows is that AI handles the time-consuming collection and synthesis work, while the human focuses on interpreting insights, making decisions, and taking action.
Key Insight: Effective research workflows combine Deep Research (external data) and NotebookLM (internal documents) to get a complete picture, with Audio Overviews to share findings efficiently.
Real-World Example: Preparing for a board presentation: use Deep Research to gather industry benchmarks and competitor data, upload internal sales reports to NotebookLM for trend analysis, combine both sets of insights into the presentation, and generate Audio Overviews for board members who want a preview.
Action step: Choose one of the four workflow examples (competitive intelligence, policy review, presentation prep, or onboarding) and write down exactly how you would apply it in your current role this week.
Module 4: Automate Your Daily Workflow
Less Busywork, More Impact
Use the Gemini Chrome side panel, auto browse for multi-step tasks, and custom Gems to eliminate repetitive work from your day.
Learning Objectives - Use the Gemini side panel in Chrome to multitask without interrupting primary work
- Compare data across multiple browser tabs using AI summarisation
- Automate multi-step web tasks with Chrome auto browse
- Create custom Gems as reusable specialised AI personas
- Integrate AI into daily email and data analysis workflows
What You'll Learn - Productivity lost to small, repetitive tasks
- The Gemini Chrome side panel for browsing assistance
- Comparing and summarising data across multiple tabs
- Chrome auto browse for multi-step web automation
- Safety features: confirmation before sensitive actions
- Custom Gems: building specialised AI personas
- Integrating AI into Google Workspace (email, data)
- Building a personal AI-augmented workflow
The Gemini Chrome Side Panel
Productivity is often lost not to big tasks but to small, repetitive ones: comparing prices across websites, summarising articles, finding details buried in long pages, or pulling data from multiple tabs. The Gemini side panel in Chrome addresses this by keeping an AI assistant available at all times while you browse, without interrupting your primary work.
The side panel stays open on the right side of your browser while your main content fills the left. This means you can keep your primary workspace - whether it is a document, a spreadsheet, or a web application - fully visible while using the AI for quick tasks. There is no need to switch tabs or open a separate window.
One of the most powerful use cases is cross-tab comparison. If you have five different software review pages open, you can ask the side panel to "create a table comparing the prices and features of all these tabs." The AI reads the content from each open tab and synthesises it into a structured comparison without you having to manually copy and paste information between pages.
The side panel is equally useful for quick summarisation. When you land on a long article or report, instead of reading the entire thing to find the relevant section, you can ask the AI: "What are the three main points of this page?" or "Does this article mention anything about remote work policies?" The AI scans the page content and gives you a direct answer.
You can also transform images on the fly using the integrated Nano Banana model directly in the browser. If you find an image that is close to what you need but not quite right, you can ask the side panel to modify it - changing colours, adding elements, or adjusting the style - without leaving your browser.
Watch video: The Gemini Chrome Side Panel
Key Insight: The Gemini Chrome side panel stays open while you work, letting you compare data across tabs, summarise pages, and transform images without switching windows or interrupting your workflow.
Real-World Example: A procurement manager has 8 vendor proposals open in different tabs. She asks the side panel: "Create a comparison table of pricing, delivery timelines, and warranty terms for all open tabs." In seconds, she has a structured table without manually reading through each 20-page proposal.
Action step: Next time you have five or more browser tabs open for research or comparison, try asking the Gemini Chrome side panel to summarise or compare them. Note how long it takes versus your usual approach.
Chrome Auto Browse: Your Web Automation Agent
While the side panel handles quick questions about what you are currently viewing, Chrome auto browse goes much further. It is a powerful agentic feature that handles multi-step tasks on your behalf, navigating across websites, filling out forms, and collecting information without you clicking through each step manually.
Auto browse can handle surprisingly complex tasks. It can optimise vacation planning by researching flight and hotel costs across multiple dates and destinations, comparing options, and presenting the best combinations. It can fill out tedious online forms that require navigating through multiple pages of input fields. It can collect tax documents from various financial institutions. It can gather quotes from local professionals by visiting multiple service provider websites and compiling their pricing.
What makes auto browse particularly practical is its authentication capability. It can use Google Password Manager to navigate tasks that require signing in to websites. This means it can access your accounts (with your stored credentials) to perform tasks that would otherwise require you to log in manually to each service.
Critically, auto browse is designed with safety in mind. It pauses and asks for your explicit confirmation before executing sensitive actions like making a purchase, submitting a payment, or any other irreversible action. This means the AI handles the tedious research and navigation, but you remain in control of all final decisions.
Key Insight: Chrome auto browse handles multi-step web tasks: researching flights, filling forms, collecting documents, and gathering quotes - but always pauses for your confirmation before sensitive actions.
Real-World Example: A business owner asks auto browse to "find the best flight and hotel options for a 3-day trip to Singapore next month, comparing at least 5 options for each." The AI browses travel sites, compiles options, and presents a comparison - pausing before any booking for the owner's approval.
What multi-step browsing task do you do repeatedly each month - gathering quotes, comparing products, or filling out forms? How much time could auto browse reclaim for you each week?
Custom Gems: Your Specialised AI Experts
One of the most practical features for daily productivity is the ability to create Custom Gems. A Gem is a custom version of Gemini that you "train" with specific instructions tailored to your needs. Think of each Gem as a specialised AI persona that already knows your context, preferences, and required output format.
Creating a Gem is straightforward. You write a set of instructions that define who the Gem is, what it knows, and how it should respond. For example, you could create a "Hiring Expert Gem" that knows your company's specific culture, hiring criteria, and evaluation standards. You tell it to always format its feedback as a list of pros and cons with a final recommendation. Once built, this Gem is available anytime you need candidate feedback - without retyping all those instructions.
The power of Gems lies in consistency and reusability. Without Gems, every time you want the AI to perform a specialised task, you need to include all the context and instructions in your prompt. This is tedious and error-prone - you might forget an important detail, or different team members might give different instructions for the same task. A Gem ensures that every interaction starts with the same baseline instructions, producing consistent output regardless of who uses it.
Practical Gem examples span every business function. A "Customer Feedback Analyst" Gem could be trained to categorise customer comments into themes and flag urgent issues. A "Meeting Notes Formatter" Gem could turn raw notes into structured action items with owners and deadlines. A "Social Media Writer" Gem could know your brand voice and always produce posts in your company's style with appropriate hashtags.
Gems can be personal (just for you) or shared with your team. Shared Gems become powerful tools for standardising processes across an organisation. When the entire customer service team uses the same "Complaint Response" Gem, every customer gets a consistent, professional response that follows company guidelines.
Watch video: Custom Gems: Your Specialised AI Experts
Key Insight: Custom Gems are reusable AI personas with pre-loaded instructions. Build them once, reuse them anytime - ensuring consistent output without re-typing context and instructions each time.
Real-World Example: A team creates a "Weekly Report Writer" Gem with instructions: "You are our department's report writer. Format all reports with these sections: Summary, Key Metrics, Highlights, Challenges, Next Week Priorities. Use data-driven language and keep it under 500 words." Every Monday, the team feeds raw data into the Gem and gets a consistently formatted report.
If you could build one Custom Gem for your team right now, what would it do? Write a one-paragraph description of its role, what it knows, and what format it should always follow.
AI in Google Workspace
Beyond the browser, Google has integrated AI capabilities directly into the productivity applications that professionals use every day. This integration means you do not need to leave your email, spreadsheet, or document to access AI assistance - it is embedded right where you work.
In Gmail, AI can help you draft replies, summarise long email threads, and suggest responses based on the conversation context. Instead of reading through a 15-message thread to understand the current status, you can ask the AI to summarise it. For replies, you can describe what you want to say and the AI drafts a response that matches the tone and context of the conversation.
In Google Sheets, AI helps with data analysis that previously required advanced formula knowledge. You can describe what you want in plain language: "Create a pivot table showing monthly sales by region" or "Highlight all cells where the value decreased compared to last month." The AI translates your request into the appropriate formulas and formatting.
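Under the hood, the AI turns a plain-language request like "highlight all cells where the value decreased compared to last month" into a simple comparison over the data. As a rough sketch of that logic (the region names and figures are purely illustrative, not tied to any real sheet):

```python
# Sketch of the comparison behind "highlight all cells where the value
# decreased compared to last month". Data is hypothetical example data.
sales = {
    "North": {"last_month": 120, "this_month": 95},
    "South": {"last_month": 80, "this_month": 110},
    "East": {"last_month": 60, "this_month": 60},
}

# Collect the regions whose current value fell below last month's value -
# these are the cells the AI would flag for highlighting.
decreased = [
    region
    for region, values in sales.items()
    if values["this_month"] < values["last_month"]
]

print(decreased)  # the regions that would be highlighted
```

The point is not that you would write this yourself - the AI generates the equivalent formulas and formatting rules for you - but that a vague-sounding request maps onto a precise, checkable comparison.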
In Google Docs, the AI writing assistance goes beyond what Canvas offers in the Gemini interface. You can generate content, rewrite sections, change tone, and format documents - all within the familiar Docs environment. This is particularly useful for collaborative documents where multiple team members need to contribute and edit.
The key benefit of workspace integration is zero context-switching. Instead of copying data from a spreadsheet into Gemini, getting a response, and pasting it back, you work directly within the application. The AI understands the context of your document, spreadsheet, or email, which means it can provide more relevant and accurate assistance.
For teams already using Google Workspace, this integration represents the lowest-friction path to AI adoption. There is nothing new to install, no new interface to learn, and no workflow disruption. The AI simply enhances the tools people already know how to use.
Watch video: AI in Google Workspace
Key Insight: Google Workspace AI integration embeds assistance directly in Gmail, Sheets, and Docs - enabling AI-powered drafting, analysis, and formatting with zero context-switching.
Which Google Workspace app do you use most - Gmail, Sheets, or Docs? How specifically could the AI integration in that app save you time on tasks you do every day?
Building Your Personal AI Workflow
The individual tools - Chrome side panel, auto browse, Gems, and Workspace integration - are powerful on their own. But the real productivity transformation happens when you combine them into a personal AI-augmented workflow tailored to your specific role and tasks.
Start by identifying the repetitive tasks that consume your time each day. Email management, data comparison, report formatting, information gathering, and routine communications are common candidates. For each task, ask yourself: could the Chrome side panel help? Could a Gem handle this? Could auto browse automate the browsing steps?
A practical approach is to build your workflow in layers. Layer 1 is the Chrome side panel for quick, in-the-moment assistance: summarising pages, comparing tabs, and answering questions about what you are currently viewing. Layer 2 is Custom Gems for recurring specialised tasks: formatting reports, analysing feedback, drafting communications in your company's voice. Layer 3 is auto browse for multi-step web tasks that you currently do manually: research, data collection, form filling.
The common principle across all these tools is that AI handles the repetitive execution while you provide the judgment and decision-making. You do not need to understand how the AI works technically. You just need to clearly describe what you want done, review the output, and make the final call.
Teams benefit even more when workflows are standardised. Create shared Gems for common tasks, establish conventions for how the side panel should be used, and document which types of tasks are good candidates for auto browse. This shared approach means the entire team benefits from AI automation, not just the early adopters who figured it out individually.
Key Insight: Build your AI workflow in three layers: Chrome side panel for quick tasks, Custom Gems for recurring specialised work, and auto browse for multi-step web automation.
Real-World Example: A sales manager's daily workflow: uses the Chrome side panel to summarise competitor pages during research, has a "Proposal Writer" Gem that formats all sales proposals in the company template, and uses auto browse to gather pricing from supplier websites before quarterly negotiations.
Map out your own three-layer AI workflow: which specific tasks would live in each layer? Start with Layer 1 and be as concrete as possible about the daily tasks you would tackle first.
Module 5: Build an AI App
From Idea to Working Prototype
Turn your ideas into functional web apps using vibe coding in Google AI Studio - no programming experience required.
Learning Objectives - Write a Product Requirements Document (PRD) as a blueprint for an AI app
- Use vibe coding in Google AI Studio Build Mode to create app prototypes
- Understand the Antigravity Agent and how it manages full-stack development
- Create quick prototypes using Gemini Canvas for HTML, React, or Python
- Deploy and share completed apps via Cloud Run, ZIP export, or GitHub
What You'll Learn - Why professionals can now build apps without coding
- The Product Requirements Document (PRD) as a blueprint
- Vibe coding: describing apps in plain English
- Google AI Studio Build Mode and live previews
- The Antigravity Agent: managing code and dependencies
- Server-side capabilities and secrets management
- Quick prototyping with Gemini Canvas
- Deployment options: Cloud Run, ZIP, and GitHub
The Product Requirements Document (PRD)
Every good app starts with a plan. In the software world, this plan is called a Product Requirements Document (PRD). Despite the technical-sounding name, a PRD is essentially a clear description of what your app does, who it is for, and what features it needs. Think of it as a detailed brief that gives the AI (or a human developer) everything it needs to build what you envision.
A PRD answers five fundamental questions. What is the problem? Define the specific pain point your app addresses. Who is the user? Describe who will use the app and what they need. What are the core features? List the must-have functionality, prioritised by importance. What does success look like? Define how you will know the app is working as intended. What are the constraints? Note any limitations like platform requirements, budget, or timeline.
The beauty of starting with a PRD is that Gemini itself can help you write it. You can describe your app idea in a few sentences and ask Gemini to brainstorm a full PRD. The AI will suggest features you might not have considered, identify potential edge cases, and structure the document in a standard format that makes the build process smoother.
For example, if you say "I want a simple tool that helps my sales team track their weekly call targets," Gemini might expand this into a PRD with features like a dashboard showing progress, a form for logging calls, weekly summary emails, manager view versus rep view, and mobile-friendly design. This expansion ensures you have thought through the user experience before a single line of code is written.
Writing the PRD first saves significant time during development. Without it, you end up in an endless cycle of "actually, can you also add..." requests that lead to confused, bloated apps. With a clear PRD, the AI has a roadmap to follow, producing better results faster.
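Put together, the five questions can be captured in a one-page fill-in template. The sketch below uses the sales call-tracker idea from above; the specific answers are illustrative examples, not prescriptions:

```
Product Requirements Document: Weekly Call Tracker (example)

Problem:     Sales reps lose track of progress against weekly call targets.
User:        Sales reps log their calls; managers review team progress.
Features:    1. Call-logging form   2. Progress dashboard
             3. Weekly summary email   4. Manager view vs rep view
Success:     Reps log calls daily; managers stop requesting manual updates.
Constraints: Must be mobile-friendly; ready before next quarter's kickoff.
```

Paste a template like this into Gemini along with your idea, and ask it to fill in or challenge each section before you start building.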
Key Insight: A PRD answers five questions: What is the problem? Who is the user? What are the core features? What does success look like? What are the constraints? Gemini can help brainstorm and write the PRD for you.
Real-World Example: You tell Gemini: "I want an app that helps our HR team schedule interviews." Gemini generates a PRD with features including calendar integration, candidate status tracking, interviewer availability checker, automated reminder emails, and a dashboard showing upcoming interviews - all from your single sentence.
Think of a small tool or app that would make your work easier. Try answering the five PRD questions - problem, user, features, success, constraints - in writing right now. You may already have your first app idea.
Vibe Coding in Google AI Studio
Vibe coding is a new approach to building software where you describe your app's goal or "vibe" in plain English and the AI writes the code for you. This is done using Build Mode in Google AI Studio, where you talk to the AI, it generates the necessary code, and you see a live, interactive preview of your app on screen instantly.
The experience is remarkably intuitive. You start by describing what you want: "Build a task manager with categories and due dates." The AI generates the code and you immediately see the app working in a preview pane. You can then refine it conversationally: "Make the buttons bigger," "Add a search bar at the top," "Change the colour scheme to blue," or "Add a login screen." Each change appears in the preview in real time, giving you instant visual feedback.
This approach is powered by the Antigravity Agent, which expertly manages context and multi-file dependencies across a full-stack environment. It builds both a client-side web frontend (like React) and a server-side Node.js runtime. This means your app is not just a pretty interface - it can connect to databases, call external APIs, and handle real business logic.
The term "vibe coding" captures the philosophy perfectly: you do not need to understand code syntax, programming languages, or software architecture. You just need to clearly describe what you want and iterate on the results. The AI translates your intent into working software.
Watch video: Vibe Coding in Google AI Studio
Key Insight: Vibe coding lets you describe your app in plain English and see it built in real time. The Antigravity Agent handles both frontend (React) and backend (Node.js) code automatically.
Real-World Example: You type: "Build a customer feedback form that saves responses to a database and shows a dashboard with charts." The AI generates the form, the database connection, and the dashboard. You see it working immediately and say: "Add a filter by date range" - done in seconds.
Does the idea of building software by describing it in plain English feel believable to you, or does it still feel too good to be true? What would you need to see to be fully convinced?
The Antigravity Agent and Full-Stack Development
Behind the scenes of vibe coding, the Antigravity Agent is the engine that makes it all work. It is an AI system that manages the complexity of building a real application - handling multiple files, code dependencies, frontend and backend components, and external services - so you do not have to think about any of it.
The Antigravity Agent builds across two layers. The client-side frontend is what users see and interact with - the buttons, forms, charts, and layout of your app. It uses modern frameworks like React to create professional, responsive interfaces. The server-side backend runs on Node.js and handles the things that happen behind the scenes: storing data in databases, calling external APIs, processing information, and managing business logic.
One particularly important capability is Secrets Management. If your app needs to connect to external services (like a payment processor, email service, or database), it requires API keys - essentially passwords that authenticate your app. The Antigravity Agent provides a secure way to store these keys on the server side, away from client-side exposure. This is critical for security: API keys that are visible in the browser can be stolen and misused.
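The principle behind secrets management can be sketched in a few lines. The helper and the variable name below are hypothetical, but the pattern is standard: the key lives only in the server's environment, so it never appears in the HTML or JavaScript delivered to the browser.

```python
import os


def get_secret(name: str) -> str:
    """Fetch an API key from the server-side environment.

    Because the value exists only in the server process, it cannot be
    read from the client-side code that users can inspect in the browser.
    """
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"Secret {name!r} is not configured on the server")
    return value


# Simulate the platform injecting a secret at deploy time.
# "PAYMENT_API_KEY" is purely illustrative, not a real service's key name.
os.environ["PAYMENT_API_KEY"] = "sk-example-123"
key = get_secret("PAYMENT_API_KEY")
```

A missing secret fails loudly at startup rather than leaking a blank credential into a live request, which is exactly the behaviour you want from server-side key storage.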
The Agent also handles automated installation of npm packages. npm (Node Package Manager) is the standard system for adding pre-built functionality to web applications. When the AI determines your app needs a specific capability - like chart rendering, date formatting, or PDF generation - it automatically installs the right package without you needing to know what npm is or how it works.
For professionals building internal tools, the Antigravity Agent means the gap between "I have an idea for a tool" and "I have a working prototype" has collapsed from weeks or months to hours or even minutes.
Watch video: The Antigravity Agent and Full-Stack Development
Key Insight: The Antigravity Agent manages full-stack development: React frontend, Node.js backend, npm packages, database connections, and secure API key storage - all invisible to you.
The Antigravity Agent handles the technical complexity so you do not have to. Does understanding what it does behind the scenes change how confident you feel about building a real app? Why or why not?
Quick Prototyping with Gemini Canvas
Not every app idea requires the full power of Google AI Studio Build Mode. Sometimes you just need a quick prototype to test a concept, demonstrate an idea to stakeholders, or create a simple interactive element. For these cases, Gemini Canvas offers a streamlined space for generating quick prototypes.
Canvas can generate HTML, React, or Python scripts with live previews. This means you can describe a simple interface - like an email subscription form, a calculator, or a data entry screen - and see it working immediately. You can then request changes conversationally: "Add a call-to-action button," "Change the background colour," or "Make it mobile-friendly."
The key difference between Canvas prototyping and AI Studio Build Mode is scope and complexity. Canvas is ideal for single-page prototypes, simple interactive elements, and quick demonstrations. AI Studio Build Mode is designed for full applications with multiple pages, database connections, and server-side logic. Think of Canvas as a sketch pad and AI Studio as a full workshop.
Canvas prototyping is particularly useful in meetings and presentations. Instead of describing what an interface could look like, you can generate a working prototype in real time. "What if the form looked like this?" becomes a live demonstration rather than an abstract discussion. This dramatically speeds up decision-making because stakeholders can see and interact with the idea rather than imagining it.
For professionals who are not sure whether their app idea warrants a full build, Canvas prototyping serves as an excellent validation step. Create a quick prototype, test the concept with a few users, gather feedback, and then decide whether to invest time in a full build using AI Studio.
Key Insight: Gemini Canvas generates quick HTML, React, or Python prototypes with live previews - ideal for testing concepts, demonstrating ideas in meetings, and validating app ideas before a full build.
Real-World Example: During a product meeting, someone suggests adding a pricing calculator to the website. Instead of scheduling a follow-up to discuss specifications, you open Canvas and say: "Create a pricing calculator with three tiers, monthly/annual toggle, and a comparison table." A working prototype appears in 30 seconds for the team to review and refine.
Action step: Think of your next meeting where a new idea or interface will be discussed. Could you use Gemini Canvas to generate a quick prototype before or during that meeting? What would you build?
Deploying and Sharing Your App
Building an app is only valuable if people can use it. Google AI Studio provides multiple deployment options so you can share your creation with users, whether they are team members, clients, or the public.
The most powerful option is Google Cloud Run. This deploys your app as a scalable cloud service with a unique URL that anyone can access. Cloud Run automatically handles traffic - if one person uses your app, it runs on minimal resources; if a thousand people use it simultaneously, it scales up automatically. For internal business tools, this means you can deploy once and the infrastructure handles growth without you managing servers.
For developers or teams with existing workflows, you can export your app as a ZIP file. This downloads all the source code to your computer so you can continue development locally using traditional tools. This option is useful when you want to use vibe coding for the initial prototype but then hand the project to a development team for refinement.
You can also push directly to a GitHub repository. GitHub is the standard platform for storing, versioning, and collaborating on code. Pushing to GitHub integrates your AI-built app with existing deployment pipelines - for example, if your company uses automated deployment from GitHub to production, your vibe-coded app can plug directly into that workflow.
For quick sharing without formal deployment, you can generate shareable preview links directly from AI Studio. These let stakeholders interact with your app prototype without any setup. Simply share the link, and they can test the app in their browser.
The choice of deployment method depends on your use case. For internal tools that need to be always available, Cloud Run is ideal. For handoff to a development team, ZIP export provides the source code. For integration with existing DevOps workflows, GitHub push is the seamless option.
Key Insight: Three deployment options: Google Cloud Run for scalable cloud hosting, ZIP export for local development, and GitHub push for integration with existing workflows. Shareable preview links work for quick stakeholder feedback.
Now that you know you can build and deploy a real app without coding, what is the first internal tool or client-facing app you would build? Who would use it, and what problem would it solve?