Header Image AI 绘画，创意无限

文生图能力

学习目标

理解文本到图像生成的基本原理
掌握DALL-E 3的使用方法
学习提示词工程在图像生成中的应用
了解图像生成的最佳实践

先修知识

OpenAI API的基本使用
基本的图像处理概念
Python编程基础

1. 文生图概述

1.1 什么是文生图

文生图（Text-to-Image）是利用人工智能将文本描述转换为视觉图像的技术。它能够：

理解自然语言描述
生成符合描述的图像
保持风格一致性
处理复杂的视觉元素

1.2 应用场景

创意设计和艺术创作
产品原型展示
教育资料制作
营销内容生成
游戏资产创建

2. DALL-E 3实现

2.1 基础图像生成器

class ImageGenerator:
    """图像生成器基类"""
    
    def __init__(self, model="dall-e-3"):
        """初始化图像生成器
        
        Args:
            model (str): 使用的模型名称
        """
        self.model = model
        self.style_presets = {
            "realistic": "照片级真实风格",
            "artistic": "艺术创意风格",
            "cartoon": "卡通动漫风格",
            "sketch": "素描手绘风格"
        }
    
    def generate_image(self, prompt: str, style: str = "realistic", size: str = "1024x1024") -> str:
        """根据提示词生成图像
        
        Args:
            prompt (str): 图像描述提示词
            style (str): 图像风格，可选值：realistic, artistic, cartoon, sketch
            size (str): 图像尺寸，可选值：1024x1024, 1024x1792, 1792x1024
            
        Returns:
            str: 生成的图像URL
        """
        # 添加风格描述
        style_desc = self.style_presets.get(style, "")
        enhanced_prompt = f"{prompt}\nStyle: {style_desc}" if style_desc else prompt
        
        try:
            response = openai.Image.create(
                model=self.model,
                prompt=enhanced_prompt,
                size=size,
                quality="hd",
                n=1
            )
            return response.data[0].url
        except Exception as e:
            print(f"图像生成失败: {str(e)}")
            return None

    def generate_variations(self, image_path: str, n: int = 3) -> list:
        """生成图像变体
        
        Args:
            image_path (str): 原始图像路径
            n (int): 生成变体的数量
            
        Returns:
            list: 生成的变体图像URL列表
        """
        try:
            with open(image_path, "rb") as image_file:
                response = openai.Image.create_variation(
                    image=image_file,
                    n=n,
                    size="1024x1024"
                )
                return [data.url for data in response.data]
        except Exception as e:
            print(f"变体生成失败: {str(e)}")
            return []

    def edit_image(self, image_path: str, mask_path: str, prompt: str) -> str:
        """编辑图像
        
        Args:
            image_path (str): 原始图像路径
            mask_path (str): 蒙版图像路径
            prompt (str): 编辑描述
            
        Returns:
            str: 生成的图像URL
        """
        try:
            with open(image_path, "rb") as image_file, open(mask_path, "rb") as mask_file:
                response = openai.Image.create_edit(
                    image=image_file,
                    mask=mask_file,
                    prompt=prompt,
                    n=1,
                    size="1024x1024"
                )
                return response.data[0].url
        except Exception as e:
            print(f"图像编辑失败: {str(e)}")
            return None

2.2 使用示例

# 初始化图像生成器
generator = ImageGenerator()

# 生成图像
prompt = """
A serene landscape with a mountain lake at sunset, 
featuring snow-capped peaks reflected in crystal clear water. 
The sky is painted in vibrant oranges and purples, 
with a few wispy clouds catching the last rays of sunlight.
"""

# 生成不同风格的图像
styles = ["realistic", "artistic", "cartoon", "sketch"]
for style in styles:
    image_url = generator.generate_image(prompt, style=style)
    print(f"{style.capitalize()} style image: {image_url}")

# 生成图像变体
original_image = "landscape.png"
variations = generator.generate_variations(original_image)
for i, url in enumerate(variations):
    print(f"Variation {i+1}: {url}")

# 编辑图像
edit_prompt = "Add a wooden cabin by the lake"
edited_url = generator.edit_image("landscape.png", "mask.png", edit_prompt)
print(f"Edited image: {edited_url}")

图像生成示例

1. 现代工作空间（扁平设计）

提示词：

Digital illustration in Flat Design style of a modern workspace with a desk, laptop, and plants, 
using clean lines and bold colors. The scene is well-lit with natural lighting from a large window. 
Minimalist aesthetic with a color palette of soft blues and warm grays.

现代工作空间

2. 智能城市（等距风格）

提示词：

Isometric illustration of a smart city concept with modern buildings, flying vehicles, 
and green energy solutions. Clean geometric shapes, consistent 120-degree angles, 
bold color palette with tech-inspired blues and whites.

智能城市

3. 日本花园（水彩风格）

提示词：

Watercolor illustration of a serene Japanese garden in spring, 
with blooming cherry blossoms, a small wooden bridge over a koi pond, 
and traditional stone lanterns. Soft, flowing colors with visible brush strokes.

日本花园

4. 数据安全仪表盘

提示词：

Here is the flat illustration of a professional discussing data security on a futuristic dashboard, 
featuring a gradient color scheme of deep blue, soft purple, and white highlights.

数据安全仪表盘

5. 圣诞树（迪士尼风格）

提示词：

A cute 3D-rendered Christmas tree with bright, vibrant colors in a minimalistic Disney-inspired style. 
Featuring simple, rounded shapes, soft lighting, and a cheerful festive atmosphere on a clean background.

圣诞树

6. 礼物盒（迪士尼风格）

提示词：

A cute 3D-rendered gift box with bright, playful colors in a simple, Disney-inspired style. 
Designed with smooth, rounded edges, soft lighting, and a festive, minimalistic appearance, 
set against a clean background.

礼物盒

提示词编写技巧

从上面的示例中，我们可以总结出一些编写高质量提示词的技巧：

明确风格：在提示词中明确指定想要的艺术风格，如扁平设计、等距风格、水彩风格等。
详细描述视觉元素：
- 形状：如"clean lines"、"geometric shapes"、"rounded edges"
- 颜色：如"soft blues"、"warm grays"、"deep blue, soft purple"
- 光照：如"well-lit"、"soft lighting"
- 氛围：如"minimalist aesthetic"、"cheerful festive atmosphere"
指定技术细节：
- 渲染方式：如"3D-rendered"
- 特定角度：如"120-degree angles"
- 背景要求：如"clean background"
参考知名风格：如"Disney-inspired style"可以快速传达特定的视觉风格。
空间布局：明确说明主要元素的位置和关系，如"in the bottom right corner"、"set against"等。

代码实现

完整的代码实现请参考 generate_images.py 文件。主要步骤包括：

初始化 OpenAI 客户端
准备详细的提示词
调用 DALL-E 3 API 生成图像
保存生成的图像到本地

关键代码片段：

def generate_image(prompt: str, size: str = "1024x1024") -> str:
    """生成图像并返回URL"""
    try:
        response = requests.post(
            f"{self.api_base}/images/generations",
            headers=self.headers,
            json={
                "model": "dall-e-3",
                "prompt": prompt,
                "n": 1,
                "size": size,
                "quality": "standard"
            }
        )
        
        if response.status_code != 200:
            print(f"API错误: {response.text}")
            return None
            
        return response.json()['data'][0]['url']
        
    except Exception as e:
        print(f"生成图像时出错: {str(e)}")
        return None

注意事项

提示词要尽可能详细和具体
指定清晰的视觉风格和技术要求
注意图像的分辨率和质量设置
处理好 API 调用的错误情况
及时保存生成的图像，因为图像 URL 是临时的

3. 提示词工程

3.1 提示词结构

一个好的图像生成提示词应包含：

主体描述（Subject）：要生成的主要对象
场景环境（Environment）：背景和环境描述
风格指定（Style）：艺术风格和视觉效果
技术参数（Technical）：相机角度、光照等
细节补充（Details）：额外的细节要求

3.2 常用提示词模板

3.2.1 场景类

现代工作空间

Digital illustration in Flat Design style of a modern workspace with a desk, laptop, and plants, 
using clean lines and bold colors. The scene is well-lit with natural lighting from a large window. 
Minimalist aesthetic with a color palette of soft blues and warm grays.

自然风景

A breathtaking landscape photograph of a mountain range at golden hour, captured with a wide-angle lens. Dramatic clouds catch the warm sunlight, creating a stunning array of orange and purple hues. Sharp details in the foreground rocks and a silky smooth lake reflection. Shot at f/11, ISO 100.

建筑设计

Architectural visualization of a contemporary minimalist house, featuring clean geometric shapes and large glass windows. The structure seamlessly integrates with its natural surroundings. Rendered in a photorealistic style with careful attention to material textures and environmental lighting.

产品展示

Professional product photography of a sleek smartphone on a white background. Top-down view with soft, even lighting and subtle shadows. The device displays a vibrant app interface. Captured with a macro lens to highlight the premium materials and craftsmanship.

3.2.2 风格类

扁平设计

Flat design illustration with:
- Clean, minimalist shapes
- Bold, solid colors
- Simple geometric patterns
- No gradients or shadows
- 2D perspective
Style: Modern and minimal

等距风格

Isometric design featuring:
- 3D objects at 120-degree angles
- Precise geometric shapes
- Consistent perspective
- Bold color palette
- Clear outlines
Style: Technical and architectural

水彩风格

Watercolor illustration with:
- Soft, flowing colors
- Visible brush strokes
- Color bleeding effects
- Natural paper texture
- Organic shapes
Style: Artistic and expressive

3.3 提示词增强技巧

风格强化

Add style-specific keywords:
- Photorealistic: "hyperrealistic, 8k, detailed textures"
- Artistic: "impressionist style, vibrant brushstrokes"
- Digital: "vector art, clean lines, gradient colors"

光照描述

Enhance lighting details:
- Natural: "golden hour sunlight, soft shadows"
- Studio: "three-point lighting setup, rim light"
- Dramatic: "high contrast, dramatic shadows"

构图要素

Specify composition:
- Rule of thirds
- Leading lines
- Symmetrical balance
- Dynamic perspective
- Depth of field

4. 高级技巧

4.1 图像质量优化

分辨率和细节

使用"high resolution"、"8K"等关键词
指定"detailed"、"intricate"等描述
添加"sharp focus"、"crisp details"

光照和氛围

描述光源类型和方向
指定阴影的软硬程度
添加环境光效果

材质和纹理

详细描述表面特性
指定反射和透明度
添加微观细节

4.2 常见问题解决

图像模糊

添加以下关键词：
- "sharp focus"
- "crystal clear"
- "4K resolution"
- "detailed"

构图不佳

指定构图要素：
- "centered composition"
- "rule of thirds"
- "balanced layout"
- "dynamic angle"

风格不一致

统一风格描述：
- 明确艺术风格
- 保持一致的渲染方式
- 指定参考作品

4.3 进阶技巧

负面提示词

指定不想要的元素
避免特定的风格
排除不需要的效果

权重控制

使用括号增加权重 (keyword)
使用多重括号提高优先级 ((keyword))
使用数字设置具体权重 (keyword:1.5)

组合提示词

def combine_prompts(*prompts, weights=None):
    """组合多个提示词
    
    Args:
        prompts: 提示词列表
        weights: 对应的权重列表
    
    Returns:
        str: 组合后的提示词
    """
    if weights is None:
        weights = [1.0] * len(prompts)
    
    weighted_prompts = [
        f"({prompt}:{weight})"
        for prompt, weight in zip(prompts, weights)
    ]
    
    return ", ".join(weighted_prompts)

4.4 工作流程优化

迭代生成

从简单提示词开始
逐步添加细节
保存成功的提示词
记录失败的尝试

批量生成

准备提示词变体
使用不同参数
比较结果
选择最佳输出

质量控制

建立评估标准
使用检查清单
收集用户反馈
持续优化流程

练习与作业

基础练习：生成不同风格的风景图
进阶练习：使用图像编辑功能创作
挑战练习：开发一个主题图像生成系统

常见问题

Q1: 如何提高生成图像的质量？ A1: 提供详细的描述，指定具体的风格和参数，使用高质量设置。

Q2: 为什么有些图像生成失败？ A2: 可能是提示词不当、内容限制或API限制导致。

文生图能力

学习目标

先修知识

1. 文生图概述

1.1 什么是文生图

1.2 应用场景

2. DALL-E 3实现

2.1 基础图像生成器

2.2 使用示例

图像生成示例

1. 现代工作空间（扁平设计）

2. 智能城市（等距风格）

3. 日本花园（水彩风格）

4. 数据安全仪表盘

5. 圣诞树（迪士尼风格）

6. 礼物盒（迪士尼风格）

提示词编写技巧

代码实现

注意事项

3. 提示词工程

3.1 提示词结构

3.2 常用提示词模板

3.2.1 场景类

3.2.2 风格类

3.3 提示词增强技巧

4. 高级技巧

4.1 图像质量优化

4.2 常见问题解决

4.3 进阶技巧

4.4 工作流程优化

练习与作业

常见问题

扩展阅读

目录