How Fin Vision understands images

Fin Vision instantly analyzes images to diagnose issues, provide solutions, or capture key details that move the conversation forward.

Written by Beth-Ann Sher
Updated over a week ago

Fin Vision is a built-in capability of Fin AI Agent that allows it to analyze and understand images sent by customers - screenshots, photos, documents, and more - directly within conversations via chat or email.

There’s no need to enable or configure anything, and there’s no additional cost.

Fin Vision helps:

  • Diagnose issues faster

  • Eliminate the need for lengthy customer explanations

  • Extract and understand visual content like error messages, receipts, product defects, and more


How Fin Vision works

Fin Vision uses multimodal large language models (LLMs) to understand images. When a customer sends an image, Fin passes it to a vision-enabled LLM, which generates a structured text description of the image. This description includes:

  • Extracted text (OCR)

  • UI elements and associated labels

  • Reference numbers, product details, and key highlights

  • Context-aware insights derived from the image

This description is then added to the chat history, which allows Fin to incorporate visual context into its responses.
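The flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Fin's actual implementation: the `ImageDescription` class, the `describe_image` stand-in (which returns canned data instead of calling a real vision model), and the message format are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ImageDescription:
    """Structured text a vision model might return for one image (hypothetical shape)."""
    extracted_text: str = ""                               # OCR output
    ui_elements: list[str] = field(default_factory=list)   # buttons, labels, fields
    key_details: list[str] = field(default_factory=list)   # reference numbers, product details
    insights: str = ""                                     # context-aware summary

    def to_message(self) -> str:
        """Flatten the description into plain text that fits in a chat history."""
        parts = [
            f"Extracted text: {self.extracted_text}",
            f"UI elements: {', '.join(self.ui_elements)}",
            f"Key details: {', '.join(self.key_details)}",
            f"Insights: {self.insights}",
        ]
        return "[Image description]\n" + "\n".join(parts)


def describe_image(image_bytes: bytes) -> ImageDescription:
    """Hypothetical stand-in for a vision-enabled LLM call; returns canned data."""
    return ImageDescription(
        extracted_text="Error 402: Payment failed",
        ui_elements=["Retry button", "Billing settings link"],
        key_details=["Order #12345"],
        insights="Customer sees a payment failure on the checkout page.",
    )


def add_images_to_history(history: list[dict], images: list[bytes]) -> list[dict]:
    """Describe each image individually and append the text to a copy of the history."""
    updated = list(history)
    for image in images:
        description = describe_image(image)
        updated.append({"role": "customer", "content": description.to_message()})
    return updated


history = [{"role": "customer", "content": "My payment isn't going through."}]
history = add_images_to_history(history, [b"fake-image-bytes"])
print(history[-1]["content"])
```

Because the image ends up as ordinary text in the conversation, later steps such as knowledge base search can treat it the same way as a typed message.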

With this understanding, Fin can:

  • Search your knowledge base more effectively

  • Resolve Tasks that depend on visual information

  • Provide relevant, actionable answers - just like it would from a customer's written input

Note:

  • Fin does not train on or analyze images within your support content (e.g., images embedded in articles). It only processes images actively sent by customers during conversations.

  • Fin currently can't generate or send images when providing AI answers.

  • Fin currently can't read alt text on images.


Ways to use Fin Vision

Example use cases by industry:

FinTech

  • Error troubleshooting: Screenshots of failed transfers or login issues help Fin provide targeted support.

  • Fraud alert review: Fin helps identify phishing screenshots or suspicious activity.

SaaS

  • Troubleshooting UI bugs: Customers share screenshots of errors or unexpected UI behavior; Fin extracts error messages and provides fixes.

  • Onboarding help: Fin can assist customers through unclear UI flows based on shared screenshots.

  • License verification: Fin reads license keys or account numbers from uploaded invoices.

E-Commerce

  • Return/refund validation: Customers upload images of damaged or incorrect products; Fin evaluates eligibility based on Task instructions.

  • Shipping issues: Customers share photos of packaging or contents; Fin determines missing items or packaging damage.

  • Invoice processing: Fin extracts order numbers and dates from receipts or packing slips.

Gaming/Gambling

  • Bug reporting: Players send screenshots of glitches or crashes; Fin interprets the visuals and logs issues.

  • Withdrawal issues: Customers upload screenshots of failed transactions; Fin pulls timestamps, amounts, and transaction IDs.

  • Bet slip verification: Fin reads and confirms bet slip details from uploaded images.


Maximizing Fin Vision

To get the most from Fin Vision, combine it with Fin’s other features:

Use with Fin Guidance

Use Fin Guidance to instruct Fin to proactively ask for images when needed. You can also guide Fin on what to look for in a screenshot and next steps based on the outcome.

Guidance examples:

  • If a customer shares a screenshot, identify the device type and suggest next steps accordingly.

  • If a user reports an error or other issue with our website, ask for a screenshot showing the error and a link to the page they are on before providing further assistance.

  • Ask the customer to provide proof of payment (receipt), either as a screenshot or photo.


FAQs

What image formats does Fin Vision support?

Fin Vision supports standard image formats including JPG, PNG, and GIF files shared by customers.
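If you want to check attachments on your side before pointing customers to Fin, a minimal sketch could look like the following. The `NAMED_FORMATS` set and the `is_named_image_format` helper are hypothetical names for this example. The set contains only the formats named above, so other formats may also be accepted, and a file extension does not guarantee the file's real type.

```python
from pathlib import Path

# Assumption: only the formats this article names; the real list may be longer.
NAMED_FORMATS = {".jpg", ".jpeg", ".png", ".gif"}


def is_named_image_format(filename: str) -> bool:
    """Return True if the file extension matches a format named in this article."""
    return Path(filename).suffix.lower() in NAMED_FORMATS


print(is_named_image_format("Screenshot.PNG"))
print(is_named_image_format("invoice.pdf"))
```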

How does Fin handle privacy and sensitive information in images?

Fin is designed with privacy in mind. The vision models are explicitly prompted not to extract any personal or sensitive information from images, such as credit card numbers, CVVs, or identification details. Additionally, images are stored temporarily and are automatically deleted after a short period.

Does Fin store images?

Images are temporarily stored in a secure cloud environment and automatically deleted after a short period.

Do customers need to send images in a certain way?

No, customers can upload or paste images into the chat or email. Fin handles the rest.

Can customers send multiple images?

Yes, Fin will analyze each image individually and use the context to inform responses.

Does Fin generate or send images?

No, Fin cannot generate images when providing AI responses. It can analyze images sent by customers, and it can send static images through Custom Answers.

Does Fin Vision support multiple languages?

Yes, Fin can extract text from images in many languages, though accuracy depends on clarity and complexity.

Can I turn off Fin Vision?

No, Fin Vision is built-in and cannot be disabled. It operates automatically as part of Fin’s understanding of conversations.

