七厦

Why is there still no nvcc command after installing cuda_12.2.2_535.104.05_linux.run?

My machine has an NVIDIA T4 GPU and runs Ubuntu 22.04. I first installed the driver with the following command:

    sudo apt install -y nvidia-driver-535-server

After the machine rebooted, I ran "nvidia-smi" to check the GPU:

    ╰─➤ nvidia-smi                                                                    130 ↵
    Mon Sep 18 14:30:16 2023
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 535.54.03              Driver Version: 535.54.03    CUDA Version: 12.2     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  Tesla T4                       Off | 00000000:AF:00.0 Off |                    0 |
    | N/A   47C    P0              27W /  70W |      2MiB / 15360MiB |      6%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+

Then I downloaded CUDA Toolkit 12.2 from "https://developer.nvidia.com/cuda-downloads?target_os=Linux&t..." (https://link.segmentfault.com/?enc=3lX4HpTDwbVmYUepMwiEig%3D%3D.hVEvpxzxrZ5nbln27HtjYX%2FZgdAL9yh2fVEKf%2BzBsut2FoxAl2GIprcrELn%2BXB3k4tuTTUNgH4yajNzLt5aX7MPvOenGOadnQHI0WBRAFLmmG6vebB5O0RH%2BJQ6JGCmbZ8nOl2AYWSzYeYL1qUJqUZfS29AFy55ZR5t1WOE1DY7jLNViJlaGZUKzxbx8L4omQCX1vFuTS5EcMlgK8i5cwQ%3D%3D) and ran the installer:

"图片.png" (https://wmprod.oss-cn-shanghai.aliyuncs.com/images/20241227/df6ec5e1bab5f00a3335f3e1a32d40fb.png)

    ╭─pon@T4GPU ~/Downloads
    ╰─➤ sudo sh cuda_12.2.2_535.104.05_linux.run
    [sudo] password for pon:

After the installation there is still no nvcc:

    ╭─pon@T4GPU ~/Downloads
    ╰─➤ nvcc --version                                                                127 ↵
    zsh: command not found: nvcc
    ╭─pon@T4GPU ~/Downloads
    ╰─➤ cd /                                                                          127 ↵
    ╭─pon@T4GPU /
    ╰─➤ fd -a -u nvcc
    /usr/share/cmake-3.22/Modules/FindCUDA/run_nvcc.cmake

What I expect is that the nvcc command is available once this CUDA Toolkit is installed.
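A quick way to narrow this down is to check whether an nvcc binary exists anywhere the runfile installer normally puts it. The sketch below is only a diagnostic under stated assumptions: it assumes the default install prefix /usr/local/cuda-<version> (with an optional /usr/local/cuda symlink), and the helper name `find_nvcc` is made up for this example. Note that the runfile installer does not add its bin directory to PATH by itself, so a binary that exists but is not on PATH would also produce "command not found". In the transcript above, however, fd finds no nvcc binary at all, which suggests the toolkit component may not have been installed in the first place.

```python
import glob
import os
import shutil


def find_nvcc(extra_patterns=("/usr/local/cuda/bin/nvcc",
                              "/usr/local/cuda-*/bin/nvcc")):
    """Return candidate nvcc paths: PATH first, then assumed default install dirs."""
    found = []
    on_path = shutil.which("nvcc")  # None when nvcc is not on PATH
    if on_path:
        found.append(on_path)
    for pattern in extra_patterns:
        for path in sorted(glob.glob(pattern)):
            if os.path.isfile(path) and path not in found:
                found.append(path)
    return found


candidates = find_nvcc()
if not candidates:
    print("No nvcc found: the toolkit itself is probably not installed.")
elif shutil.which("nvcc") is None:
    # The binary exists but its directory is missing from PATH.
    print("nvcc exists at", candidates[0], "but its directory is not on PATH.")
else:
    print("nvcc is on PATH:", candidates[0])
```

If the second branch is hit, adding the reported bin directory to PATH in the shell startup file is the usual post-installation step.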

时光旅人

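Given two switching power supplies, one flyback (pulse-modulated) and one fixed-load-power (not pulse-modulated), with a sampling rate below 500 kHz and both supplies at the same power (e.g. 200 W), can the two be distinguished on the 220 V side from the voltage or current waveform?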

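Background: There are two switching power supplies, one flyback (pulse-modulated) and one fixed-load-power (not pulse-modulated). Typical pulse-modulated loads are battery chargers of all kinds; a typical fixed-load-power load is a spotlight. Both loads are connected to mains power.

Question: With a sampling rate below 500 kHz and both supplies at the same power (e.g. 200 W), can the two be distinguished on the 220 V side from the voltage or current waveform? If so, please describe in detail the features that distinguish them. If not, is there some other way to tell them apart?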

一颗西兰花

TypeError: 'Namespace' object is not iterable?

I hit this error while debugging the following line. How can I fix it?

    for idx, (img_a, att_a, c_org) in enumerate(test_dataloader):

"This is test_dataloader" (https://wmprod.oss-cn-shanghai.aliyuncs.com/images/20241228/fff819831c34022cdc7d3afae7c2502f.png) https://wmprod.oss-cn-shanghai.aliyuncs.com/images/20241228/ce470333719a973619f0cea1dc1de645.png
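The error message itself says that the object being looped over is an `argparse.Namespace`, not a DataLoader. The minimal sketch below reproduces the same message with a hypothetical `args` object (the option names are invented for illustration). It does not show where the name gets rebound in the asker's code, which is only visible in the screenshots, but it is one plausible explanation: `test_dataloader` may have been assigned the parsed arguments instead of the loader built from them.

```python
import argparse

# Hypothetical parsed options, standing in for whatever the script parses.
args = argparse.Namespace(batch_size=4, data_dir="./data")


def try_iterate(obj):
    """Iterate over obj as the training loop would; return the error text, if any."""
    try:
        for _idx, (_img_a, _att_a, _c_org) in enumerate(obj):
            pass
    except TypeError as exc:
        return str(exc)
    return None


# Looping over the Namespace fails exactly like the reported error.
error_message = try_iterate(args)
print(error_message)

# Looping over an iterable of 3-tuples (what a DataLoader would yield) works.
fake_batches = [("img", "att", "c_org"), ("img", "att", "c_org")]
print(try_iterate(fake_batches))
```

Printing `type(test_dataloader)` just before the loop is a quick way to confirm which object the name actually refers to.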

清晨我上码

PyTorch in 10 Days - Day 10 - Model Deployment & Inference

Model Deployment & Inference

- Model deployment
- Model inference

We convert a model trained in PyTorch to ONNX format and then run inference on it with ONNX Runtime.

1. ONNX

ONNX (Open Neural Network Exchange) is a format for describing computation graphs in a standard way. It was released jointly by Facebook (now Meta) and Microsoft in 2017.

- By defining a standard format that is independent of environment and platform, ONNX lets AI models be exchanged between frameworks and environments. It can be seen as a bridge between deep learning frameworks and deployment targets, much like the intermediate language of a compiler.
- Because frameworks differ in compatibility, ONNX is usually used only to represent static graphs, which are easier to deploy. Hardware and software vendors only need to optimize model performance against the ONNX standard for every ONNX-compatible framework to benefit.
- ONNX focuses on model prediction: a model trained in any framework can, once converted to ONNX, be deployed easily in an ONNX-compatible runtime.

2. ONNX Runtime

- ONNX Runtime website: https://www.onnxruntime.ai/
- ONNX Runtime GitHub: https://github.com/microsoft/onnxruntime

ONNX Runtime is a cross-platform machine learning inference accelerator maintained by Microsoft. It works with ONNX directly: it reads a .onnx file and runs inference on it, with no need to convert the file to another format.

With ONNX Runtime, PyTorch also covers the last mile of deployment, giving the pipeline PyTorch --> ONNX --> ONNX Runtime.

- Compatibility between ONNX Runtime and CUDA
- Compatibility between ONNX Runtime, TensorRT and CUDA

3. Converting a model to ONNX

- torch.onnx.export() is the function that converts a model to ONNX format.
- Before exporting, we must call model.eval() or model.train(False) so that the model is in inference mode.

    import torch
    import torch.onnx
    import torchvision

    # Name of the exported file; the extension must be .onnx
    onnx_file_name = "resnet50.onnx"
    # The model to convert; replace it with your own model
    model = torchvision.models.resnet50(pretrained=True)
    # Load the weights; replace "resnet50.pt" with your own weights file.
    # load_state_dict() updates the model in place, so its return value
    # must not be assigned back to model.
    model.load_state_dict(torch.load("resnet50.pt"))
    # Before exporting, call model.eval() or model.train(False)
    model.eval()
    # dummy_input is a sample input; it only provides the input shape, dtype, etc.
    batch_size = 1  # arbitrary value; it matters little once dynamic_axes is set
    dummy_input = torch.randn(batch_size, 3, 224, 224, requires_grad=True)
    # Model output for this input
    output = model(dummy_input)
    # Export the model
    torch.onnx.export(model,                     # the model to export
                      dummy_input,               # a sample input
                      onnx_file_name,            # output path / file name
                      export_params=True,        # if True (the default), the parameters are exported too;
                                                 # set it to False to export an untrained model
                      opset_version=10,          # ONNX operator set version (15 was the latest at the time of writing)
                      do_constant_folding=True,  # whether to apply constant folding
                      input_names=['conv1'],     # names of the model's input tensors
                      output_names=['fc'],       # names of the model's output tensors
                      # dynamic_axes marks the batch_size dimension as dynamic, so data used for
                      # inference later may have a batch size different from dummy_input's
                      dynamic_axes={'conv1': {0: 'batch_size'}, 'fc': {0: 'batch_size'}})

Checking the ONNX model

We need to check that the exported model file is usable, which we do with onnx.checker.check_model().

    import onnx

    # Load the exported model file
    onnx_model = onnx.load(onnx_file_name)
    # Use exception handling for the check
    try:
        # An exception is raised when the model is not usable
        onnx.checker.check_model(onnx_model)
    except onnx.checker.ValidationError as e:
        print("The model is invalid: %s" % e)
    else:
        # No exception is raised when the model is usable, and "The model is valid!" is printed
        print("The model is valid!")

Visualizing the ONNX model

Use Netron for visualization (download: https://netron.app/). It shows, among other things, the model's input and output information.

Inference with ONNX Runtime

    import onnxruntime

    # Name of the ONNX model file to run
    onnx_file_name = "xxxxxx.onnx"
    # onnxruntime.InferenceSession creates an ONNX Runtime inference session
    ort_session = onnxruntime.InferenceSession(onnx_file_name, providers=['CPUExecutionProvider'])
    # session_fp32 = onnxruntime.InferenceSession("resnet50.onnx", providers=['CUDAExecutionProvider'])
    # session_fp32 = onnxruntime.InferenceSession("resnet50.onnx", providers=['OpenVINOExecutionProvider'])

    # Build the input dictionary; its key must match the input_names used when exporting.
    # input_img must also be converted to an ndarray.
    # ort_inputs = {'conv1': input_img}
    # The form below is recommended because it avoids typing the key by hand
    ort_inputs = {ort_session.get_inputs()[0].name: input_img}

    # run() performs the inference. The first argument is the list of output tensor
    # names and can usually be None; the second is the input dictionary.
    # The result is wrapped in a list, so we index it with [0].
    ort_output = ort_session.run(None, ort_inputs)[0]
    # output = {ort_session.get_outputs()[0].name}
    # ort_output = ort_session.run([output], ort_inputs)[0]

Notes:

- A PyTorch model takes tensors as input, while ONNX Runtime takes arrays, so we either convert the tensor or read the data directly as an array.
- The shape of the input array should match the shape of the dummy_input used for export (apart from axes declared dynamic). If the image size differs, resize it first.
- The result of run() is a list; index it to get the result as an array.
- When building the input dictionary, the key must match the input_names set at export time.

Complete code

1. Install & download

    #!pip install onnx -i https://pypi.tuna.tsinghua.edu.cn/simple
    #!pip install onnxruntime -i https://pypi.tuna.tsinghua.edu.cn/simple
    #!pip install torch -i https://pypi.tuna.tsinghua.edu.cn/simple
    # Download ImageNet labels
    #!wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt

2. Define the model

    import torch
    import io
    import time
    from PIL import Image
    import torchvision.transforms as transforms
    from torchvision import datasets
    import onnx
    import onnxruntime
    import torchvision
    import numpy as np
    from torch import nn
    import torch.nn.init as init

    onnx_file = 'resnet50.onnx'
    save_dir = './resnet50.pt'

    # Download the pretrained model
    Resnet50 = torchvision.models.resnet50(pretrained=True)
    # Save the model weights
    torch.save(Resnet50.state_dict(), save_dir)
    print(Resnet50)

3. Export the model to ONNX

    batch_size = 1  # just a random number
    # First build the model structure
    loaded_model = torchvision.models.resnet50()
    # Then load the model weights
    loaded_model.load_state_dict(torch.load(save_dir))
    # Single GPU
    # loaded_model.cuda()
    # Put the model in inference mode
    loaded_model.eval()
    # Input to the model
    x = torch.randn(batch_size, 3, 224, 224, requires_grad=True)
    torch_out = loaded_model(x)
    torch_out

    # Export the model
    torch.onnx.export(loaded_model,              # model being run
                      x,                         # model input (or a tuple for multiple inputs)
                      onnx_file,                 # where to save the model (can be a file or file-like object)
                      export_params=True,        # store the trained parameter weights inside the model file
                      opset_version=10,          # the ONNX version to export the model to
                      do_constant_folding=True,  # whether to execute constant folding for optimization
                      input_names=['conv1'],     # the model's input names
                      output_names=['fc'],       # the model's output names
                      # variable length axes
                      dynamic_axes={'conv1': {0: 'batch_size'}, 'fc': {0: 'batch_size'}})

Output:

    ============= Diagnostic Run torch.onnx.export version 2.0.0+cu117 =============
    verbose: False, log level: Level.ERROR
    ======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

4. Check the ONNX model

    # Use exception handling for the check
    try:
        # An exception is raised when the model is not usable
        onnx.checker.check_model(onnx_file)
    except onnx.checker.ValidationError as e:
        print("The model is invalid: %s" % e)
    else:
        # No exception is raised when the model is usable, and "The model is valid!" is printed
        print("The model is valid!")

5. Run inference with ONNX Runtime

    import onnxruntime
    import numpy as np

    ort_session = onnxruntime.InferenceSession(onnx_file, providers=['CPUExecutionProvider'])

    # Convert a tensor to an ndarray
    def to_numpy(tensor):
        return tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()

    # Build the input dictionary and compute the output
    ort_inputs = {ort_session.get_inputs()[0].name: to_numpy(x)}
    ort_outs = ort_session.run(None, ort_inputs)

    # Compare the results of PyTorch and ONNX Runtime
    np.testing.assert_allclose(to_numpy(torch_out), ort_outs[0], rtol=1e-03, atol=1e-05)
    print("Exported model has been tested with ONNXRuntime, and the result looks good!")

6. Run a real prediction and visualize it

    # Inference data
    from PIL import Image
    from torchvision.transforms import transforms

    # Prepare the image used for inference
    image = Image.open('./images/cat.jpg')
    # Resize the image to the given size
    image = image.resize((224, 224))
    # Convert the image to RGB mode
    image = image.convert('RGB')
    image.save('./images/cat_224.jpg')

    categories = []
    # Read the categories
    with open("./imagenet/imagenet_classes.txt", "r") as f:
        categories = [s.strip() for s in f.readlines()]

    def get_class_name(probabilities):
        # Show top categories per image
        top5_prob, top5_catid = torch.topk(probabilities, 5)
        for i in range(top5_prob.size(0)):
            print(categories[top5_catid[i]], top5_prob[i].item())

    # Preprocessing
    def pre_image(image_file):
        input_image = Image.open(image_file)
        preprocess = transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
        ])
        input_tensor = preprocess(input_image)
        inputs = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model
        # input_arr = inputs.cpu().detach().numpy()
        return inputs

    # Inference with the PyTorch model
    # First build the model structure
    resnet50 = torchvision.models.resnet50()
    # Then load the model weights
    resnet50.load_state_dict(torch.load(save_dir))
    resnet50.eval()
    # Inference
    input_batch = pre_image('./images/cat_224.jpg')

    # move the input and model to GPU for speed if available
    print("GPU Availability: ", torch.cuda.is_available())
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        resnet50.to('cuda')

    with torch.no_grad():
        output = resnet50(input_batch)
    # Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes
    # print(output[0])
    # The output has unnormalized scores. To get probabilities, you can run a softmax on it.
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    get_class_name(probabilities)

Output:

    GPU Availability:  False
    Persian cat 0.6668420433998108
    lynx 0.023987364023923874
    bow tie 0.016234245151281357
    hair slide 0.013150070793926716
    Japanese spaniel 0.012279157526791096

    # Benchmark
    latency = []
    for i in range(10):
        with torch.no_grad():
            start = time.time()
            output = resnet50(input_batch)
            probabilities = torch.nn.functional.softmax(output[0], dim=0)
            top5_prob, top5_catid = torch.topk(probabilities, 5)
            # for catid in range(top5_catid.size(0)):
            #     print(categories[catid])
            latency.append(time.time() - start)
        # Prints the running mean latency over the iterations so far
        print("{} model inference CPU time:cost {} ms".format(str(i), format(sum(latency) * 1000 / len(latency), '.2f')))

Output:

    0 model inference CPU time:cost 149.59 ms
    1 model inference CPU time:cost 130.74 ms
    2 model inference CPU time:cost 133.76 ms
    3 model inference CPU time:cost 130.64 ms
    4 model inference CPU time:cost 131.72 ms
    5 model inference CPU time:cost 130.88 ms
    6 model inference CPU time:cost 136.31 ms
    7 model inference CPU time:cost 139.95 ms
    8 model inference CPU time:cost 141.90 ms
    9 model inference CPU time:cost 140.96 ms

    # Inference with ONNX Runtime
    import onnxruntime
    from onnx import numpy_helper
    import time

    onnx_file = 'resnet50.onnx'
    session_fp32 = onnxruntime.InferenceSession(onnx_file, providers=['CPUExecutionProvider'])
    # session_fp32 = onnxruntime.InferenceSession("resnet50.onnx", providers=['CUDAExecutionProvider'])
    # session_fp32 = onnxruntime.InferenceSession("resnet50.onnx", providers=['OpenVINOExecutionProvider'])

    def softmax(x):
        """Compute softmax values for each sets of scores in x."""
        e_x = np.exp(x - np.max(x))
        return e_x / e_x.sum()

    latency = []

    def run_sample(session, categories, inputs):
        start = time.time()
        input_arr = inputs
        ort_outputs = session.run([], {'conv1': input_arr})[0]
        output = ort_outputs.flatten()
        output = softmax(output)  # this is optional
        top5_catid = np.argsort(-output)[:5]
        # for catid in top5_catid:
        #     print(categories[catid])
        latency.append(time.time() - start)
        return ort_outputs

    input_tensor = pre_image('./images/cat_224.jpg')
    input_arr = input_tensor.cpu().detach().numpy()
    for i in range(10):
        ort_output = run_sample(session_fp32, categories, input_arr)
        print("{} ONNX Runtime CPU Inference time = {} ms".format(str(i), format(sum(latency) * 1000 / len(latency), '.2f')))

Output:

    0 ONNX Runtime CPU Inference time = 67.66 ms
    1 ONNX Runtime CPU Inference time = 56.30 ms
    2 ONNX Runtime CPU Inference time = 53.90 ms
    3 ONNX Runtime CPU Inference time = 58.18 ms
    4 ONNX Runtime CPU Inference time = 64.53 ms
    5 ONNX Runtime CPU Inference time = 62.79 ms
    6 ONNX Runtime CPU Inference time = 61.75 ms
    7 ONNX Runtime CPU Inference time = 60.51 ms
    8 ONNX Runtime CPU Inference time = 59.35 ms
    9 ONNX Runtime CPU Inference time = 57.57 ms

4. Further topics

- Model quantization
- Model pruning
- Engineering optimization
- Operator optimization

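The ONNX Runtime benchmark in the deployment article post-processes the raw scores with a softmax and then keeps the five largest entries. As a small dependency-free sketch of that step (plain Python lists instead of NumPy arrays, and the names `softmax` and `top_k` are chosen here for illustration), subtracting the maximum score before exponentiating is what keeps the computation from overflowing on large logits:

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k=5):
    """Return (index, probability) pairs for the k largest probabilities."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return [(i, probabilities[i]) for i in order[:k]]


logits = [1.0, 3.0, 2.0]           # made-up scores for three classes
probs = softmax(logits)
best_two = top_k(probs, k=2)       # classes 1 and 2, in that order
large = softmax([1000.0, 1000.0])  # would overflow without the shift
print(best_two, large)
```

Because softmax is monotonic, the ranking of classes is the same with or without it, which is why the article marks that step as optional when only the top-5 indices are needed.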
清晨我上码

PyTorch in 10 Days - Day 07 - Saving and Loading Models

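PyTorch Model Saving & Loading

- Saving a model
- Saving on a single GPU & on multiple GPUs
- Loading on a single GPU & on multiple GPUs

1. Saving a model

- PyTorch models are mainly saved with the extensions pkl, pt and pth; in practice there is no difference between them.
- A PyTorch model has two parts: the model structure and the weights. The model is a class that inherits from nn.Module; the weights are stored in a dictionary (the key is the layer name, the value is the weight tensor).
- There are therefore two ways to save: save the whole model (structure and weights), or save only the weights (recommended).

    import torch
    from torchvision import models

    model = models.resnet50(pretrained=True)
    save_dir = './resnet50.pth'

    # Save the whole model: structure + weights
    torch.save(model, save_dir)
    # Save only the weights (note the parentheses: state_dict is a method)
    torch.save(model.state_dict(), save_dir)
    # pt, pth and pkl all work for saving either the weights or the whole model

2. Saving on a single GPU & on multiple GPUs

There are two ways to move a model and data onto the GPU in PyTorch: .cuda() and .to(device).

Note: for multi-GPU training, wrap the model in torch.nn.DataParallel.

2.1 nn.DataParallel

CLASS torch.nn.DataParallel(module, device_ids=None, output_device=None, dim=0)

- module is the model you defined.
- device_ids lists the devices used for training.
- output_device is the device that holds the output. It is usually omitted, in which case it defaults to device_ids[0].

Note: this is why the first GPU usually shows higher memory usage.

    import os
    import torch
    from torch import nn
    from torchvision import models

    # Single GPU
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # for multiple GPUs use something like 0,1,2
    model = model.cuda()  # single GPU
    # print(model)

    # Multiple GPUs
    os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
    model = torch.nn.DataParallel(model).cuda()  # multiple GPUs
    # print(model)

Note that CUDA_VISIBLE_DEVICES only takes effect if it is set before CUDA is initialized in the process, so each of the snippets below should be read as the start of a fresh run.

2.3 Save on a single GPU + load on a single GPU

    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # replace with the GPU index you want to use
    model = models.resnet50(pretrained=True)
    model.cuda()
    save_dir = 'resnet50.pt'  # save path

    # Save + load the whole model
    torch.save(model, save_dir)
    loaded_model = torch.load(save_dir)
    loaded_model.cuda()

    # Save + load the model weights
    torch.save(model.state_dict(), save_dir)
    # First build the model structure
    loaded_model = models.resnet50()
    # Then load the model weights
    loaded_model.load_state_dict(torch.load(save_dir))
    loaded_model.cuda()

2.4 Save on a single GPU + load on multiple GPUs

    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # replace with the GPU index you want to use
    model = models.resnet50(pretrained=True)
    model.cuda()

    # Save + load the whole model
    torch.save(model, save_dir)
    os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'  # replace with the GPU indices you want to use
    loaded_model = torch.load(save_dir)
    loaded_model = nn.DataParallel(loaded_model).cuda()

    # Save + load the model weights
    torch.save(model.state_dict(), save_dir)
    os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'  # replace with the GPU indices you want to use
    loaded_model = models.resnet50()  # the model structure must be defined here
    loaded_model.load_state_dict(torch.load(save_dir))
    loaded_model = nn.DataParallel(loaded_model).cuda()

2.5 Save on multiple GPUs + load on a single GPU

The core problem is how to get rid of the "module" prefix in the keys of the weight dictionary, so that the model stays consistent.

- When loading a whole model, take the module attribute of the loaded object.
- When loading weights, save the weights of the model's module attribute in the first place.

    os.environ['CUDA_VISIBLE_DEVICES'] = '1,2'  # replace with the GPU indices you want to use
    model = models.resnet50(pretrained=True)
    model = nn.DataParallel(model).cuda()

    # Save + load the whole model
    torch.save(model, save_dir)
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # replace with the GPU index you want to use
    loaded_model = torch.load(save_dir).module

    os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2'  # replace with the GPU indices you want to use
    model = models.resnet50(pretrained=True)
    model = nn.DataParallel(model).cuda()

    # Save the weights
    torch.save(model.module.state_dict(), save_dir)
    # Load the weights
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # replace with the GPU index you want to use
    loaded_model = models.resnet50()  # the model structure must be defined here
    loaded_model.load_state_dict(torch.load(save_dir))
    loaded_model.cuda()

2.6 Save on multiple GPUs + load on multiple GPUs

Saving the whole model also saves information such as the GPU ids in use. If that information does not match the GPUs available at load time, loading may fail or the program may not run as intended. Two problems can occur:

1. The whole model is loaded and then wrapped in nn.DataParallel. The GPU ids stored in the model are then likely to differ from those configured in the loading environment, so during training the data and the model end up on different devices and an error is raised.
2. The whole model is loaded without wrapping it in nn.DataParallel. The program then automatically trains on the first n GPUs of the machine (n being the number of GPUs the saved model used), and fails if fewer than n GPUs are made available.

Recommended approach: save only the model weights, then wrap the model in nn.DataParallel after loading. This avoids both problems, so in multi-GPU mode it is best to save and load weights.

    os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2'  # replace with the GPU indices you want to use
    model = models.resnet50(pretrained=True)
    model = nn.DataParallel(model).cuda()

    # Save + load the model weights (strongly recommended).
    # Save model.module's weights so the keys carry no "module." prefix
    # and can be loaded into a plain resnet50.
    torch.save(model.module.state_dict(), save_dir)
    # Load the weights
    loaded_model = models.resnet50()  # the model structure must be defined here
    loaded_model.load_state_dict(torch.load(save_dir))
    loaded_model = nn.DataParallel(loaded_model).cuda()

Recommendations

- Whether saving on one GPU or several, prefer saving the model weights.
- Whether using one GPU or several, load the weights first, then choose multi-GPU (nn.DataParallel) or single-GPU (cuda) loading.

    # Usage example (code excerpt)
    My_model.eval()
    test_total_loss = 0
    test_total_correct = 0
    test_total_num = 0
    # Loss of the previous round. Initialise it once, before the epoch loop;
    # infinity makes the first evaluation count as an improvement.
    past_test_loss = float('inf')
    save_model_step = 10  # save the model every 10 steps
    for iter, (images, labels) in enumerate(test_loader):
        images = images.to(device)
        labels = labels.to(device)
        outputs = My_model(images)
        loss = criterion(outputs, labels)
        test_total_correct += (outputs.argmax(1) == labels).sum().item()
        test_total_loss += loss.item()
        test_total_num += labels.shape[0]
    test_loss = test_total_loss / test_total_num
    print("Epoch [{}/{}], train_loss:{:.4f}, train_acc:{:.4f}%, test_loss:{:.4f}, test_acc:{:.4f}%".format(
        i+1, epoch, train_total_loss / train_total_num, train_total_correct / train_total_num * 100,
        test_total_loss / test_total_num, test_total_correct / test_total_num * 100
    ))
    # model save
    if test_loss < past_test_loss:
        # Save the model weights
        torch.save(My_model.state_dict(), save_dir)
        # Save the model weights + model structure
        # torch.save(My_model, save_dir)
    if iter % save_model_step == 0:
        # Save the model weights
        torch.save(My_model.state_dict(), save_dir)
        # Save the model weights + model structure
        # torch.save(My_model, save_dir)
    past_test_loss = test_loss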
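Section 2.5 describes the "module" prefix problem: a state dict taken from an nn.DataParallel wrapper has every key prefixed with "module.", so it does not load into an unwrapped model. When a checkpoint has already been saved that way, one common workaround is to rename the keys at load time. The sketch below illustrates the idea on a plain dictionary with made-up keys and placeholder values; the helper name `strip_module_prefix` is an assumption of this example, not a PyTorch API.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading 'module.' removed from each key."""
    return OrderedDict(
        (key[len(prefix):] if key.startswith(prefix) else key, value)
        for key, value in state_dict.items()
    )


# Stand-in for torch.load(save_dir): keys as a DataParallel wrapper would
# produce them; the string values are placeholders for weight tensors.
saved = OrderedDict([
    ("module.conv1.weight", "w0"),
    ("module.fc.weight", "w1"),
    ("module.fc.bias", "b1"),
])

cleaned = strip_module_prefix(saved)
print(list(cleaned.keys()))
```

With a real checkpoint, the cleaned dictionary would then be passed to `load_state_dict` of the unwrapped model. Keys without the prefix are left untouched, so the helper is also safe to apply to checkpoints saved from a single-GPU model.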

清晨我上码

PyTorch in 10 Days - Day 05 - Visualization

PyTorch Visualization

1. Visualizing the model structure
2. Visualizing the training process
3. Visualizing model evaluation

    # Import common packages
    import os
    import numpy as np
    import torch
    from torch import nn
    from torch.utils.data import Dataset, DataLoader
    from torchvision.transforms import transforms
    import torchvision
    import torch.nn.functional as F

    # Custom model
    class DemoModel(nn.Module):
        def __init__(self):
            super(DemoModel, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

    model = DemoModel()
    # Method 1: print (model structure)
    print(model)

Output:

    DemoModel(
      (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
      (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
      (fc1): Linear(in_features=400, out_features=120, bias=True)
      (fc2): Linear(in_features=120, out_features=84, bias=True)
      (fc3): Linear(in_features=84, out_features=10, bias=True)
    )

torchinfo

    # pip3 install torchinfo

- torchinfo is very easy to use: just call torchinfo.summary(). The required arguments are the model and input_size, given as [batch_size, channel, h, w].
- It reports per-module information (the type, output shape and parameter count of each layer), the total number of parameters, the model size, the memory needed for one forward/backward pass, and so on.

    from torchinfo import summary

    model = DemoModel()  # instantiate the model
    # Method 2: view the model structure with torchinfo
    summary(model, (1, 3, 32, 32))  # 1: batch_size, 3: number of image channels, 32, 32: image height and width

Output:

    ==========================================================================================
    Layer (type:depth-idx)                   Output Shape              Param #
    ==========================================================================================
    DemoModel                                [1, 10]                   --
    ├─Conv2d: 1-1                            [1, 6, 28, 28]            456
    ├─MaxPool2d: 1-2                         [1, 6, 14, 14]            --
    ├─Conv2d: 1-3                            [1, 16, 10, 10]           2,416
    ├─MaxPool2d: 1-4                         [1, 16, 5, 5]             --
    ├─Linear: 1-5                            [1, 120]                  48,120
    ├─Linear: 1-6                            [1, 84]                   10,164
    ├─Linear: 1-7                            [1, 10]                   850
    ==========================================================================================
    Total params: 62,006
    Trainable params: 62,006
    Non-trainable params: 0
    Total mult-adds (M): 0.66
    ==========================================================================================
    Input size (MB): 0.01
    Forward/backward pass size (MB): 0.05
    Params size (MB): 0.25
    Estimated Total Size (MB): 0.31
    ==========================================================================================

TensorBoard

- As a visualization tool, TensorBoard covers input data (images in particular), model structure, parameter distributions and debugging.
- TensorBoard records whatever data we choose, including the feature map of each layer, the weights, the training loss, and so on.
- Here we use TensorBoard to visualize the training process.

Install:

    pip3 install tensorboard

Start TensorBoard:

    tensorboard --logdir=/path/to/logs/ --port=xxxx

- "/path/to/logs/" is the directory where the TensorBoard records are saved; it corresponds to the './runs' passed to SummaryWriter in the code.
- port is the port number for accessing TensorBoard from outside; open ip:port to view it.

    # from tensorboard import SummaryWriter
    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter('./runs')
    # Method 3: view the model with TensorBoard
    writer.add_graph(model, torch.rand(1, 3, 32, 32))
    writer.close()

[Figure: TensorBoard graph view]

    # Hyperparameters
    # Batch size
    batch_size = 16  # 32, 64 or 128 are also possible
    # Learning rate of the optimizer
    lr = 1e-4
    # Number of epochs
    max_epochs = 2

    # Option 1: select the GPUs through the environment
    # os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # use GPUs 0 and 1
    # Option 2: use a "device"; afterwards call .to(device) on everything that should live on the GPU
    # device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # use GPU 0

    # Data loading
    # Build the Dataset, using CIFAR-10 as the example
    from torchvision import datasets

    # "data_transform" applies transformations to the images, such as flipping,
    # cropping or normalization; define your own as needed
    data_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

    # download=False assumes the dataset is already present in ./cifar10
    train_cifar_dataset = datasets.CIFAR10('cifar10', train=True, download=False, transform=data_transform)
    test_cifar_dataset = datasets.CIFAR10('cifar10', train=False, download=False, transform=data_transform)

    # With the Dataset built, use DataLoader to read the data in batches
    train_loader = torch.utils.data.DataLoader(train_cifar_dataset, batch_size=batch_size,
                                               num_workers=4, shuffle=True, drop_last=True)
    test_loader = torch.utils.data.DataLoader(test_cifar_dataset, batch_size=batch_size,
                                              num_workers=4, shuffle=False)

    # Training & validation
    writer = SummaryWriter('./runs')
    # Set fixed random number seed
    torch.manual_seed(42)

    # Define the loss function and the optimizer
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    My_model = DemoModel()
    My_model = My_model.to(device)
    # Cross entropy
    criterion = torch.nn.CrossEntropyLoss()
    # Optimizer
    optimizer = torch.optim.Adam(My_model.parameters(), lr=lr)

    epoch = max_epochs
    total_step = len(train_loader)
    train_all_loss = []
    test_all_loss = []

    for i in range(epoch):
        My_model.train()
        train_total_loss = 0
        train_total_num = 0
        train_total_correct = 0
        for iter, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)

            # Write the network graph at epoch 0, batch 0
            # (i is the current epoch index; epoch holds the total number of epochs)
            if i == 0 and iter == 0:
                writer.add_graph(My_model, input_to_model=images, verbose=True)
            # Write an image at every batch 0
            if iter == 0:
                writer.add_image("Example input", images[0], global_step=i)

            outputs = My_model(images)
            loss = criterion(outputs, labels)
            train_total_correct += (outputs.argmax(1) == labels).sum().item()

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_total_num += labels.shape[0]
            train_total_loss += loss.item()

            # Print statistics
            writer.add_scalar("Loss/Minibatches", train_total_loss, train_total_num)
            print("Epoch [{}/{}], Iter [{}/{}], train_loss:{:.4f}".format(
                i+1, epoch, iter+1, total_step, loss.item()/labels.shape[0]))

        # Write loss for epoch
        writer.add_scalar("Loss/Epochs", train_total_loss, i)

        My_model.eval()
        test_total_loss = 0
        test_total_correct = 0
        test_total_num = 0
        for iter, (images, labels) in enumerate(test_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = My_model(images)
            loss = criterion(outputs, labels)
            test_total_correct += (outputs.argmax(1) == labels).sum().item()
            test_total_loss += loss.item()
            test_total_num += labels.shape[0]

        print("Epoch [{}/{}], train_loss:{:.4f}, train_acc:{:.4f}%, test_loss:{:.4f}, test_acc:{:.4f}%".format(
            i+1, epoch, train_total_loss / train_total_num, train_total_correct / train_total_num * 100,
            test_total_loss / test_total_num, test_total_correct / test_total_num * 100
        ))
        train_all_loss.append(np.round(train_total_loss / train_total_num, 4))
        test_all_loss.append(np.round(test_total_loss / test_total_num, 4))

[Figures: TensorBoard loss curves; TensorBoard model graph]
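The output shapes and parameter counts that torchinfo reports for DemoModel can be checked by hand. The sketch below is a plain-Python illustration using the standard formulas for convolution output size and for the parameter counts of convolution and linear layers with bias; the helper names are chosen for this example only.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (square input)."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(c_in, c_out, kernel):
    """Weights plus one bias per output channel of a Conv2d layer."""
    return c_out * (c_in * kernel * kernel + 1)


def linear_params(n_in, n_out):
    """Weights plus one bias per output unit of a Linear layer."""
    return n_out * (n_in + 1)


# Follow a 32x32 input through conv1 -> pool -> conv2 -> pool.
sizes = [32]
for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
    sizes.append(conv_out(sizes[-1], kernel, stride))

params = [
    conv_params(3, 6, 5),            # conv1
    conv_params(6, 16, 5),           # conv2
    linear_params(16 * 5 * 5, 120),  # fc1
    linear_params(120, 84),          # fc2
    linear_params(84, 10),           # fc3
]
total_params = sum(params)
print(sizes, params, total_params)
```

The final spatial size of 5 is why fc1 takes 16 * 5 * 5 input features, and the per-layer counts add up to the 62,006 total in the torchinfo summary.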

清晨我上码

PyTorch in 10 Days - Day 04 - Building a Model

PyTorch Model Building

1. GPU configuration
2. Data preprocessing
3. Splitting into training, validation and test sets
4. Choosing a model
5. Setting the loss function & optimization method
6. Evaluating the model

This section mainly covers parts 4 and 5.

    # Import common packages
    import os
    import numpy as np
    import torch
    from torch import nn
    import torch.nn.functional as F
    from torch.utils.data import Dataset, DataLoader
    from torchvision.transforms import transforms

    # Hyperparameters
    # Batch size
    batch_size = 16  # 32, 64 or 128 are also possible
    # Learning rate of the optimizer
    lr = 1e-4
    # Number of epochs
    max_epochs = 10

    # Option 1: select the GPUs through the environment
    os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # use GPUs 0 and 1
    # Option 2: use a "device"; afterwards call .to(device) on everything that should live on the GPU
    device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")  # use GPU 1

    # Data loading
    # Build the Dataset, using CIFAR-10 as the example
    from torchvision import datasets

    # "data_transform" applies transformations to the images, such as flipping,
    # cropping or normalization; define your own as needed
    data_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

    train_cifar_dataset = datasets.CIFAR10('cifar10', train=True, download=False, transform=data_transform)
    test_cifar_dataset = datasets.CIFAR10('cifar10', train=False, download=False, transform=data_transform)

    # With the Dataset built, use DataLoader to read the data in batches
    train_loader = torch.utils.data.DataLoader(train_cifar_dataset, batch_size=batch_size,
                                               num_workers=4, shuffle=True, drop_last=True)
    test_loader = torch.utils.data.DataLoader(test_cifar_dataset, batch_size=batch_size,
                                              num_workers=4, shuffle=False)

    # Define the model
    # Method 1: a pretrained model
    import torchvision

    Resnet50 = torchvision.models.resnet50(pretrained=True)
    # Replace the final layer with a 10-class output for CIFAR-10. Assigning to
    # fc.out_features alone would not change the shape of the layer's weights.
    Resnet50.fc = torch.nn.Linear(Resnet50.fc.in_features, 10)
    print(Resnet50)

    # Training & validation
    # Define the loss function and the optimizer
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    # Loss function: cross entropy
    criterion = torch.nn.CrossEntropyLoss()
    # Optimizer
    optimizer = torch.optim.Adam(Resnet50.parameters(), lr=lr)

    epoch = max_epochs
    Resnet50 = Resnet50.to(device)
    total_step = len(train_loader)
    train_all_loss = []
    test_all_loss = []

    for i in range(epoch):
        Resnet50.train()
        train_total_loss = 0
        train_total_num = 0
        train_total_correct = 0
        for iter, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = Resnet50(images)
            loss = criterion(outputs, labels)
            train_total_correct += (outputs.argmax(1) == labels).sum().item()

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_total_num += labels.shape[0]
            train_total_loss += loss.item()
            print("Epoch [{}/{}], Iter [{}/{}], train_loss:{:.4f}".format(
                i+1, epoch, iter+1, total_step, loss.item()/labels.shape[0]))

        Resnet50.eval()
        test_total_loss = 0
        test_total_correct = 0
        test_total_num = 0
        for iter, (images, labels) in enumerate(test_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = Resnet50(images)
            loss = criterion(outputs, labels)
            test_total_correct += (outputs.argmax(1) == labels).sum().item()
            test_total_loss += loss.item()
            test_total_num += labels.shape[0]

        print("Epoch [{}/{}], train_loss:{:.4f}, train_acc:{:.4f}%, test_loss:{:.4f}, test_acc:{:.4f}%".format(
            i+1, epoch, train_total_loss / train_total_num, train_total_correct / train_total_num * 100,
            test_total_loss / test_total_num, test_total_correct / test_total_num * 100
        ))
        train_all_loss.append(np.round(train_total_loss / train_total_num, 4))
        test_all_loss.append(np.round(test_total_loss / test_total_num, 4))

    # Method 2: a custom model
    class DemoModel(nn.Module):
        def __init__(self):
            super(DemoModel, self).__init__()
            self.conv1 = nn.Conv2d(3, 6, 5)
            self.pool = nn.MaxPool2d(2, 2)
            self.conv2 = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(16 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            x = self.pool(F.relu(self.conv1(x)))
            x = self.pool(F.relu(self.conv2(x)))
            x = x.view(-1, 16 * 5 * 5)
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x

    # Training & validation
    # Define the loss function and the optimizer
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    # Cross entropy
    criterion = torch.nn.CrossEntropyLoss()

    epoch = max_epochs
    My_model = DemoModel()
    My_model = My_model.to(device)
    # Optimizer: it must be given the parameters of the model being trained
    optimizer = torch.optim.Adam(My_model.parameters(), lr=lr)
    total_step = len(train_loader)
    train_all_loss = []
    test_all_loss = []

    for i in range(epoch):
        My_model.train()
        train_total_loss = 0
        train_total_num = 0
        train_total_correct = 0
        for iter, (images, labels) in enumerate(train_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = My_model(images)
            loss = criterion(outputs, labels)
            train_total_correct += (outputs.argmax(1) == labels).sum().item()

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            train_total_num += labels.shape[0]
            train_total_loss += loss.item()
            print("Epoch [{}/{}], Iter [{}/{}], train_loss:{:.4f}".format(
                i+1, epoch, iter+1, total_step, loss.item()/labels.shape[0]))

        My_model.eval()
        test_total_loss = 0
        test_total_correct = 0
        test_total_num = 0
        for iter, (images, labels) in enumerate(test_loader):
            images = images.to(device)
            labels = labels.to(device)
            outputs = My_model(images)
            loss = criterion(outputs, labels)
            test_total_correct += (outputs.argmax(1) == labels).sum().item()
            test_total_loss += loss.item()
            test_total_num += labels.shape[0]

        print("Epoch [{}/{}], train_loss:{:.4f}, train_acc:{:.4f}%, test_loss:{:.4f}, test_acc:{:.4f}%".format(
            i+1, epoch, train_total_loss / train_total_num, train_total_correct / train_total_num * 100,
            test_total_loss / test_total_num, test_total_correct / test_total_num * 100
        ))
        train_all_loss.append(np.round(train_total_loss / train_total_num, 4))
        test_all_loss.append(np.round(test_total_loss / test_total_num, 4))
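One detail of the loops above is worth keeping in mind when reading the printed losses. With its default settings, torch.nn.CrossEntropyLoss returns the mean loss over the batch, so summing `loss.item()` per batch and then dividing by the number of samples yields a value that is roughly the true per-sample loss divided by the batch size. The sketch below uses made-up numbers and plain Python (the function name `average_loss` is an assumption of this example) to contrast that with an average weighted by batch size.

```python
def average_loss(batch_mean_losses, batch_sizes):
    """Per-sample average loss from per-batch mean losses and batch sizes."""
    total = sum(loss * n for loss, n in zip(batch_mean_losses, batch_sizes))
    return total / sum(batch_sizes)


# Made-up per-batch mean losses; the last batch is smaller, as happens
# in a test loader without drop_last.
batch_mean_losses = [2.0, 1.0, 0.5]
batch_sizes = [16, 16, 8]

# What the loops in the article compute: sum of batch means / sample count.
as_in_article = sum(batch_mean_losses) / sum(batch_sizes)

# Weighting each batch mean by its size recovers the per-sample average.
weighted = average_loss(batch_mean_losses, batch_sizes)
print(as_in_article, weighted)
```

The relative trend across epochs is the same either way, so the article's curves are still useful for comparison; only the absolute scale differs.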

清晨我上码

PyTorch in 10 Days - Day 06 - Building Complex Models

1、PyTorch 复杂模型构建1、模型截图2、模型部件实现3、模型组装2、模型定义2.1、Sequential1、当模型的前向计算为简单串联各个层的计算时, Sequential 类可以通过更加简单的方式定义模型。2、可以接收一个子模块的有序字典(OrderedDict) 或者一系列子模块作为参数来逐一添加 Module 的实例,模型的前向计算就是将这些实例按添加的顺序逐⼀计算3、使用Sequential定义模型的好处在于简单、易读,同时使用Sequential定义的模型不需要再写forwardimport torch.nn as nn net = nn.Sequential( nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10), ) print(net)Sequential( (0): Linear(in_features=784, out_features=256, bias=True) (1): ReLU() (2): Linear(in_features=256, out_features=10, bias=True) )import collections import torch.nn as nn net2 = nn.Sequential(collections.OrderedDict([ ('fc1', nn.Linear(784, 256)), ('relu1', nn.ReLU()), ('fc2', nn.Linear(256, 10)) ])) print(net2)Sequential( (fc1): Linear(in_features=784, out_features=256, bias=True) (relu1): ReLU() (fc2): Linear(in_features=256, out_features=10, bias=True) )2.2、ModuleListModuleList 接收一个子模块(或层,需属于nn.Module类)的列表作为输入,然后也可以类似List那样进行append和extend操作nn.ModuleList 并没有定义一个网络,它只是将不同的模块储存在一起。ModuleList中元素的先后顺序并不代表其在网络中的真实位置顺序net = nn.ModuleList([nn.Linear(784, 256), nn.ReLU()]) net.append(nn.Linear(256, 10)) # # 类似List的append操作 print(net[-1]) # 类似List的索引访问 print(net)Linear(in_features=256, out_features=10, bias=True) ModuleList( (0): Linear(in_features=784, out_features=256, bias=True) (1): ReLU() (2): Linear(in_features=256, out_features=10, bias=True) )2.3、ModuleDictModuleList 接收一个子模块(或层,需属于nn.Module类)的列表作为输入,然后也可以类似List那样进行append和extend操作增加子模块或层的同时权重也会自动添加到网络中来net = nn.ModuleDict({ 'linear': nn.Linear(784, 256), 'act': nn.ReLU(), }) net['output'] = nn.Linear(256, 10) # 添加 print(net['linear']) # 访问 print(net.output) print(net)net = nn.ModuleDict({ 'linear': nn.Linear(784, 256), 'act': nn.ReLU(), }) net['output'] = nn.Linear(256, 10) # 添加 print(net['linear']) # 访问 print(net.output) print(net)3、手搓Restnet503.1、Restnet50resnet 
won first place in the ImageNet competition's classification and object detection tasks, and first place in object detection and image segmentation on the COCO dataset.

3.2 Approach

ResNet50 takes a 224x224 input image and passes it through conv1, conv2, conv3, conv4, and conv5, followed by average pooling and a fully connected layer. Because the intermediate stages reuse the same building block, we write that block as a class and instantiate it repeatedly.

3.3 Key ideas in ResNet

1. It introduces the residual module.
2. It uses Batch Normalization to speed up training (activations are normalized to mean 0 and variance 1).

3.4 Model structure (ResNet50)

1. conv1: stride=2, kernel_size=7, out_channels=64
2. conv2_x
2.1 max_pool: kernel_size=3, stride=2
2.2 conv_01: stride=1, kernel_size=1, out_channels=64
2.3 conv_02: stride=1, kernel_size=3, out_channels=64
2.4 conv_03: stride=1, kernel_size=1, out_channels=256
2.5 layers: (conv_01 + conv_02 + conv_03) * 3
3. conv3_x
3.1 conv_01: stride=1, kernel_size=1, out_channels=128
3.2 conv_02: stride=2 (first block only, stride=1 in the remaining blocks), kernel_size=3, out_channels=128
3.3 conv_03: stride=1, kernel_size=1, out_channels=512
3.4 residual (shortcut projection, first block only): stride=2, kernel_size=1, out_channels=512
3.5 layers: (conv_01 + conv_02 + conv_03) * 4
4. conv4_x
4.1 conv_01: stride=1, kernel_size=1, out_channels=256
4.2 conv_02: stride=2 (first block only), kernel_size=3, out_channels=256
4.3 conv_03: stride=1, kernel_size=1, out_channels=1024
4.4 residual (first block only): stride=2, kernel_size=1, out_channels=1024
4.5 layers: (conv_01 + conv_02 + conv_03) * 6
5. conv5_x
5.1 conv_01: stride=1, kernel_size=1, out_channels=512
5.2 conv_02: stride=2 (first block only), kernel_size=3, out_channels=512
5.3 conv_03: stride=1, kernel_size=1, out_channels=2048
5.4 residual (first block only): stride=2, kernel_size=1, out_channels=2048
5.5 layers: (conv_01 + conv_02 + conv_03) * 3
6. fc
6.1 AdaptiveAvgPool2d: output=(1, 1)
6.2 flatten: (x, 1)
6.3 fc: linear(512 * 4, num_class)

```python
import torch.nn as nn
import torch


class Block(nn.Module):
    """Bottleneck block: 1x1 conv, 3x3 conv, 1x1 conv, plus a shortcut."""

    def __init__(self, in_channels, out_channels, stride=1, downsample=False):
        super(Block, self).__init__()
        out_channel_01, out_channel_02, out_channel_03 = out_channels
        self.downsample = downsample

        self.relu = nn.ReLU(inplace=True)
        self.conv1 = nn.Sequential(
            nn.Conv2d(in_channels, out_channel_01, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channel_01),
            nn.ReLU(inplace=True)
        )
        # The 3x3 convolution carries the stride of the block.
        self.conv2 = nn.Sequential(
            nn.Conv2d(out_channel_01, out_channel_02, kernel_size=3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_channel_02),
            nn.ReLU(inplace=True)
        )
        self.conv3 = nn.Sequential(
            nn.Conv2d(out_channel_02, out_channel_03, kernel_size=1, stride=1, bias=False),
            nn.BatchNorm2d(out_channel_03),
        )
        # Projection shortcut so the input matches the output shape.
        if downsample:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channel_03, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channel_03)
            )

    def forward(self, x):
        x_shortcut = x
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        if self.downsample:
            x_shortcut = self.shortcut(x_shortcut)
        x = x + x_shortcut
        x = self.relu(x)
        return x
```

```python
class Resnet50(nn.Module):
    def __init__(self):
        super(Resnet50, self).__init__()
        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.BatchNorm2d(64),
            nn.ReLU(),
        )
        Layers = [3, 4, 6, 3]  # number of blocks in conv2_x .. conv5_x
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.conv2 = self._make_layer(64, (64, 64, 256), Layers[0], 1)
        self.conv3 = self._make_layer(256, (128, 128, 512), Layers[1], 2)
        self.conv4 = self._make_layer(512, (256, 256, 1024), Layers[2], 2)
        self.conv5 = self._make_layer(1024, (512, 512, 2048), Layers[3], 2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Sequential(
            nn.Linear(2048, 1000)
        )

    def forward(self, input):
        x = self.conv1(input)
        x = self.maxpool(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x

    def _make_layer(self, in_channels, out_channels, blocks, stride=1):
        layers = []
        # The first block changes the channel count (and possibly the resolution).
        block_1 = Block(in_channels, out_channels, stride=stride, downsample=True)
        layers.append(block_1)
        # The remaining blocks keep the shape, so they use the identity shortcut.
        for i in range(1, blocks):
            layers.append(Block(out_channels[2], out_channels, stride=1, downsample=False))
        return nn.Sequential(*layers)
```

```python
# Method 1: print the output shape of each top-level layer
net = Resnet50()
x = torch.rand((10, 3, 224, 224))
for name, layer in net.named_children():
    if name != "fc":
        x = layer(x)
        print(name, 'output shape:', x.shape)
    else:
        x = x.view(x.size(0), -1)
        x = layer(x)
        print(name, 'output shape:', x.shape)
```

```
conv1 output shape: torch.Size([10, 64, 112, 112])
maxpool output shape: torch.Size([10, 64, 56, 56])
conv2 output shape: torch.Size([10, 256, 56, 56])
conv3 output shape: torch.Size([10, 512, 28, 28])
conv4 output shape: torch.Size([10, 1024, 14, 14])
conv5 output shape: torch.Size([10, 2048, 7, 7])
avgpool output shape: torch.Size([10, 2048, 1, 1])
fc output shape: torch.Size([10, 1000])
```

```python
# Method 2: visualize the network structure with torchinfo
from torchinfo import summary

net = Resnet50()
summary(net, input_size=(10, 3, 224, 224))
```

```
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Resnet50                                 [10, 1000]                --
├─Sequential: 1-1                        [10, 64, 112, 112]        --
│    └─Conv2d: 2-1                       [10, 64, 112, 112]        9,472
│    └─BatchNorm2d: 2-2                  [10, 64, 112, 112]        128
│    └─ReLU: 2-3                         [10, 64, 112, 112]        --
├─MaxPool2d: 1-2                         [10, 64, 56, 56]          --
├─Sequential: 1-3                        [10, 256, 56, 56]         --
│    └─Block: 2-4                        [10, 256, 56, 56]         --
│    │    └─Sequential: 3-1              [10, 64, 56, 56]          4,224
│    │    └─Sequential: 3-2              [10, 64, 56, 56]          36,992
│    │    └─Sequential: 3-3              [10, 256, 56, 56]         16,896
│    │    └─Sequential: 3-4              [10, 256, 56, 56]         16,896
│    │    └─ReLU: 3-5                    [10, 256, 56, 56]         --
│    └─Block: 2-5                        [10, 256, 56, 56]         --
│    │    └─Sequential: 3-6              [10, 64, 56, 56]          16,512
│    │    └─Sequential: 3-7              [10, 64, 56, 56]          36,992
│    │    └─Sequential: 3-8              [10, 256, 56, 56]         16,896
│    │    └─ReLU: 3-9                    [10, 256, 56, 56]         --
│    └─Block: 2-6                        [10, 256, 56, 56]         --
│    │    └─Sequential: 3-10             [10, 64, 56, 56]          16,512
│    │    └─Sequential: 3-11             [10, 64, 56, 56]          36,992
│    │    └─Sequential: 3-12             [10, 256, 56, 56]         16,896
│    │    └─ReLU: 3-13                   [10, 256, 56, 56]         --
├─Sequential: 1-4                        [10, 512, 28, 28]         --
│    └─Block: 2-7                        [10, 512, 28, 28]         --
│    │    └─Sequential: 3-14             [10, 128, 56, 56]         33,024
│    │    └─Sequential: 3-15             [10, 128, 28, 28]         147,712
│    │    └─Sequential: 3-16             [10, 512, 28, 28]         66,560
│    │    └─Sequential: 3-17             [10, 512, 28, 28]         132,096
│    │    └─ReLU: 3-18                   [10, 512, 28, 28]         --
│    └─Block: 2-8                        [10, 512, 28, 28]         --
│    │    └─Sequential: 3-19             [10, 128, 28, 28]         65,792
│    │    └─Sequential: 3-20             [10, 128, 28, 28]         147,712
│    │    └─Sequential: 3-21             [10, 512, 28, 28]         66,560
│    │    └─ReLU: 3-22                   [10, 512, 28, 28]         --
│    └─Block: 2-9                        [10, 512, 28, 28]         --
│    │    └─Sequential: 3-23             [10, 128, 28, 28]         65,792
│    │    └─Sequential: 3-24             [10, 128, 28, 28]         147,712
│    │    └─Sequential: 3-25             [10, 512, 28, 28]         66,560
│    │    └─ReLU: 3-26                   [10, 512, 28, 28]         --
│    └─Block: 2-10                       [10, 512, 28, 28]         --
│    │    └─Sequential: 3-27             [10, 128, 28, 28]         65,792
│    │    └─Sequential: 3-28             [10, 128, 28, 28]         147,712
│    │    └─Sequential: 3-29             [10, 512, 28, 28]         66,560
│    │    └─ReLU: 3-30                   [10, 512, 28, 28]         --
├─Sequential: 1-5                        [10, 1024, 14, 14]        --
│    └─Block: 2-11                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-31             [10, 256, 28, 28]         131,584
│    │    └─Sequential: 3-32             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-33             [10, 1024, 14, 14]        264,192
│    │    └─Sequential: 3-34             [10, 1024, 14, 14]        526,336
│    │    └─ReLU: 3-35                   [10, 1024, 14, 14]        --
│    └─Block: 2-12                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-36             [10, 256, 14, 14]         262,656
│    │    └─Sequential: 3-37             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-38             [10, 1024, 14, 14]        264,192
│    │    └─ReLU: 3-39                   [10, 1024, 14, 14]        --
│    └─Block: 2-13                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-40             [10, 256, 14, 14]         262,656
│    │    └─Sequential: 3-41             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-42             [10, 1024, 14, 14]        264,192
│    │    └─ReLU: 3-43                   [10, 1024, 14, 14]        --
│    └─Block: 2-14                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-44             [10, 256, 14, 14]         262,656
│    │    └─Sequential: 3-45             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-46             [10, 1024, 14, 14]        264,192
│    │    └─ReLU: 3-47                   [10, 1024, 14, 14]        --
│    └─Block: 2-15                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-48             [10, 256, 14, 14]         262,656
│    │    └─Sequential: 3-49             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-50             [10, 1024, 14, 14]        264,192
│    │    └─ReLU: 3-51                   [10, 1024, 14, 14]        --
│    └─Block: 2-16                       [10, 1024, 14, 14]        --
│    │    └─Sequential: 3-52             [10, 256, 14, 14]         262,656
│    │    └─Sequential: 3-53             [10, 256, 14, 14]         590,336
│    │    └─Sequential: 3-54             [10, 1024, 14, 14]        264,192
│    │    └─ReLU: 3-55                   [10, 1024, 14, 14]        --
├─Sequential: 1-6                        [10, 2048, 7, 7]          --
│    └─Block: 2-17                       [10, 2048, 7, 7]          --
│    │    └─Sequential: 3-56             [10, 512, 14, 14]         525,312
│    │    └─Sequential: 3-57             [10, 512, 7, 7]           2,360,320
│    │    └─Sequential: 3-58             [10, 2048, 7, 7]          1,052,672
│    │    └─Sequential: 3-59             [10, 2048, 7, 7]          2,101,248
│    │    └─ReLU: 3-60                   [10, 2048, 7, 7]          --
│    └─Block: 2-18                       [10, 2048, 7, 7]          --
│    │    └─Sequential: 3-61             [10, 512, 7, 7]           1,049,600
│    │    └─Sequential: 3-62             [10, 512, 7, 7]           2,360,320
│    │    └─Sequential: 3-63             [10, 2048, 7, 7]          1,052,672
│    │    └─ReLU: 3-64                   [10, 2048, 7, 7]          --
│    └─Block: 2-19                       [10, 2048, 7, 7]          --
│    │    └─Sequential: 3-65             [10, 512, 7, 7]           1,049,600
│    │    └─Sequential: 3-66             [10, 512, 7, 7]           2,360,320
│    │    └─Sequential: 3-67             [10, 2048, 7, 7]          1,052,672
│    │    └─ReLU: 3-68                   [10, 2048, 7, 7]          --
├─AdaptiveAvgPool2d: 1-7                 [10, 2048, 1, 1]          --
├─Sequential: 1-8                        [10, 1000]                --
│    └─Linear: 2-20                      [10, 1000]                2,049,000
==========================================================================================
Total params: 25,557,096
Trainable params: 25,557,096
Non-trainable params: 0
Total mult-adds (G): 40.90
==========================================================================================
Input size (MB): 6.02
Forward/backward pass size (MB): 1778.32
Params size (MB): 102.23
Estimated Total Size (MB): 1886.57
==========================================================================================
```

```python
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
import torchvision
import os
import numpy as np
import torch
```

```python
# Hyperparameters
# Batch size
batch_size = 16  # 32, 64, or 128 are also common choices
# Learning rate for the optimizer
lr = 1e-4
# Number of epochs to run
max_epochs = 2

# Option 1: select GPUs through an environment variable
# os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # expose GPUs 0 and 1
# Option 2: use a "device" and call .to(device) on everything that should live on the GPU
# device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # use GPU 0
```

```python
# Data loading
# CIFAR10 is used as an example of building a Dataset
from torchvision import datasets

# "data_transform" applies transformations to the images, such as flipping,
# cropping, or normalization; define it to suit your task.
data_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# download=False assumes the dataset is already present under ./cifar10
train_cifar_dataset = datasets.CIFAR10('cifar10', train=True, download=False, transform=data_transform)
test_cifar_dataset = datasets.CIFAR10('cifar10', train=False, download=False, transform=data_transform)

# Once the Dataset is built, DataLoader reads the data in batches
train_loader = torch.utils.data.DataLoader(train_cifar_dataset,
                                           batch_size=batch_size, num_workers=4,
                                           shuffle=True, drop_last=True)

test_loader = torch.utils.data.DataLoader(test_cifar_dataset,
                                          batch_size=batch_size, num_workers=4,
                                          shuffle=False)
```

```python
from torch.utils.tensorboard import SummaryWriter

# Training and validation
writer = SummaryWriter('./runs')

# Set a fixed random seed
torch.manual_seed(42)

# Model, loss function, and optimizer
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
My_model = Resnet50()
My_model = My_model.to(device)
# Cross-entropy loss
criterion = torch.nn.CrossEntropyLoss()
# Optimizer
optimizer = torch.optim.Adam(My_model.parameters(), lr=lr)
epoch = max_epochs

total_step = len(train_loader)
train_all_loss = []
test_all_loss = []

for i in range(epoch):
    My_model.train()
    train_total_loss = 0
    train_total_num = 0
    train_total_correct = 0

    # "step" is used instead of "iter" so the builtin is not shadowed
    for step, (images, labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)

        # Write the network graph at epoch 0, batch 0
        # (the loop index i is the current epoch; "epoch" holds the total)
        if i == 0 and step == 0:
            writer.add_graph(My_model, input_to_model=images, verbose=True)

        # Write an example image at batch 0 of every epoch
        if step == 0:
            writer.add_image("Example input", images[0], global_step=i)

        outputs = My_model(images)
        loss = criterion(outputs, labels)
        train_total_correct += (outputs.argmax(1) == labels).sum().item()

        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # CrossEntropyLoss returns the batch mean, so weight it by the batch
        # size before dividing by the sample count at the end of the epoch.
        train_total_num += labels.shape[0]
        train_total_loss += loss.item() * labels.shape[0]

        # Log and print statistics
        writer.add_scalar("Loss/Minibatches", loss.item(), i * total_step + step)
        print("Epoch [{}/{}], Iter [{}/{}], train_loss:{:.4f}".format(
            i + 1, epoch, step + 1, total_step, loss.item()))

    # Write the mean loss for the epoch
    writer.add_scalar("Loss/Epochs", train_total_loss / train_total_num, i)

    My_model.eval()
    test_total_loss = 0
    test_total_correct = 0
    test_total_num = 0
    # Gradients are not needed during evaluation
    with torch.no_grad():
        for step, (images, labels) in enumerate(test_loader):
            images = images.to(device)
            labels = labels.to(device)

            outputs = My_model(images)
            loss = criterion(outputs, labels)
            test_total_correct += (outputs.argmax(1) == labels).sum().item()
            test_total_loss += loss.item() * labels.shape[0]
            test_total_num += labels.shape[0]

    print("Epoch [{}/{}], train_loss:{:.4f}, train_acc:{:.4f}%, test_loss:{:.4f}, test_acc:{:.4f}%".format(
        i + 1, epoch,
        train_total_loss / train_total_num, train_total_correct / train_total_num * 100,
        test_total_loss / test_total_num, test_total_correct / test_total_num * 100
    ))
    train_all_loss.append(np.round(train_total_loss / train_total_num, 4))
    test_all_loss.append(np.round(test_total_loss / test_total_num, 4))
```

```python
# Method 3: view the network in TensorBoard
writer.add_graph(net, torch.rand(10, 3, 224, 224))
writer.close()
```

4. Additional Notes

input: shape (10, 3, 224, 224)
10: batch_size, 3: RGB channels, 224: height, 224: width

conv: (in_depth=3, out_depth=64, kernel_size=7, stride=2, padding=3)

output:
width = floor((224 - 7 + 2*3) / 2) + 1 = 112
height = floor((224 - 7 + 2*3) / 2) + 1 = 112
out_depth: 64
batch_size: 10
shape: (10, 64, 112, 112)
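The conv1 calculation above generalizes to every layer that changes the spatial size. The following is a minimal sketch in plain Python, assuming dilation 1 and the floor rounding that `nn.Conv2d` and `nn.MaxPool2d` use by default; the helper name `conv_output_size` is made up for this illustration and is not part of PyTorch.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1).

    Hypothetical helper for illustration: floor((size + 2p - k) / s) + 1.
    """
    return (size + 2 * padding - kernel_size) // stride + 1


# conv1: kernel 7, stride 2, padding 3
conv1 = conv_output_size(224, kernel_size=7, stride=2, padding=3)

# maxpool: kernel 3, stride 2, padding 1
pool = conv_output_size(conv1, kernel_size=3, stride=2, padding=1)

# conv2_x keeps the resolution (stride 1); the first 3x3 convolution of
# conv3_x, conv4_x, and conv5_x uses stride 2 with padding 1.
sizes = [pool]
for _ in range(3):
    sizes.append(conv_output_size(sizes[-1], kernel_size=3, stride=2, padding=1))

print(conv1, pool, sizes)
```

The resulting sizes (112, then 56, 28, 14, 7) agree with the spatial dimensions in the per-layer shapes printed earlier in the post.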

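As a cross-check on the torchinfo summary earlier in this post, the parameter counts can be reproduced by hand. This is a sketch under the usual counting rules (a convolution has in * out * k * k weights plus one bias per output channel if bias is enabled; BatchNorm2d has two learnable parameters per channel); the helper names are hypothetical and written only for this illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters of a square-kernel 2D convolution."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def batchnorm2d_params(channels):
    """Learnable scale and shift per channel (running stats are buffers)."""
    return 2 * channels


def linear_params(in_features, out_features, bias=True):
    """Learnable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Stem: the 7x7 convolution in Resnet50.conv1 keeps its bias.
stem_conv = conv2d_params(3, 64, 7, bias=True)
stem_bn = batchnorm2d_params(64)

# First bottleneck block of conv2_x: each entry is conv (no bias) + BatchNorm.
block_conv1 = conv2d_params(64, 64, 1, bias=False) + batchnorm2d_params(64)
block_conv2 = conv2d_params(64, 64, 3, bias=False) + batchnorm2d_params(64)
block_conv3 = conv2d_params(64, 256, 1, bias=False) + batchnorm2d_params(256)

# Final classifier
fc = linear_params(2048, 1000)

print(stem_conv, stem_bn, block_conv1, block_conv2, block_conv3, fc)
```

These values correspond to the rows Conv2d: 2-1, BatchNorm2d: 2-2, Sequential: 3-1 to 3-3, and Linear: 2-20 in the summary table.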
0
0
0
Views: 2025
清晨我上码

PyTorch in 10 Days - Day 01 - Getting Started

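Introduction to PyTorch

- Developed by a team at Facebook
- Download and installation: https://pytorch.org/get-started/locally/
- Install PyTorch 2.0

Key points of this lesson:

- What a tensor is
- Tensor arithmetic
- Tensor broadcasting

```python
import torch
import numpy as np

# Randomly initialized tensors
x = torch.rand(4, 3)   # uniform on [0, 1)
print(x)
y = torch.randn(4, 3)  # standard normal
print(y)
```

```
tensor([[0.9480, 0.9501, 0.2717],
        [0.8003, 0.0821, 0.6529],
        [0.3265, 0.4726, 0.6464],
        [0.9685, 0.5453, 0.2186]])
tensor([[-0.5172, -0.1762, -1.0094],
        [ 0.1688, -1.6217, -0.8422],
        [-0.4597, -0.5814, -1.3831],
        [ 0.1718,  0.2061,  1.0907]])
```

```python
# Tensor of all zeros
a = torch.zeros((4, 4), dtype=torch.long)
print(a)
# Tensor of all ones
b = torch.ones(4, 4)
print(b)
# Tensor built from a NumPy array
c = torch.tensor(np.ones((2, 3), dtype='int32'))
print(c)
```

```
tensor([[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]])
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[1, 1, 1],
        [1, 1, 1]], dtype=torch.int32)
```

Common ways to construct a Tensor (the examples above use torch.rand, torch.randn, torch.zeros, torch.ones, and torch.tensor):

```python
# Basic tensor operations
# Addition
print(a + b)
# add_ is the in-place version: it modifies a and returns it
y = a.add_(3)
print(y)
```

```
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[3, 3, 3, 3],
        [3, 3, 3, 3],
        [3, 3, 3, 3],
        [3, 3, 3, 3]])
```

```python
# Indexing
x = torch.rand(3, 4)
print(x)
# Second column
print(x[:, 1])
# Second row
print(x[1, :])
```

The output recorded below does not match this snippet: it shows a 4x4 tensor with negative entries followed by a 2x8 tensor, so it appears to come from a different cell (possibly a torch.randn(4, 4) tensor and a reshaped view of it).

```
tensor([[-0.0617,  2.3109,  0.0030,  0.6941],
        [ 0.4677, -1.9160,  0.6614, -1.7743],
        [-0.3349,  0.2371,  2.1070, -1.0076],
        [ 0.3823, -1.2401, -0.3766, -1.0454]])
tensor([[-0.0617,  2.3109,  0.0030,  0.6941,  0.4677, -1.9160,  0.6614, -1.7743],
        [-0.3349,  0.2371,  2.1070, -1.0076,  0.3823, -1.2401, -0.3766, -1.0454]])
```

```python
# Broadcasting
# When an element-wise operation is applied to two Tensors of different shapes,
# broadcasting may be triggered: elements are (conceptually) copied so that the
# two Tensors have the same shape, and the operation is then applied element-wise.
x = torch.arange(1, 4).view(1, 3)
print(x)
y = torch.arange(1, 5).view(4, 1)
print(y)
print(x + y)
```

```
tensor([[1, 2, 3]])
tensor([[1],
        [2],
        [3],
        [4]])
tensor([[2, 3, 4],
        [3, 4, 5],
        [4, 5, 6],
        [5, 6, 7]])
```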
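To make the broadcasting rule concrete, here is a small sketch in plain Python (no torch needed) of how the result shape is worked out: shapes are aligned from the trailing dimension, and two sizes are compatible when they are equal or one of them is 1. The function name `broadcast_shape` is an assumption made for this illustration; it mirrors the documented rule rather than PyTorch's internal implementation.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the last dimension; a missing leading
    # dimension is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast sizes {a} and {b}")
    return tuple(reversed(result))


# The example from the post: (1, 3) + (4, 1)
shape = broadcast_shape((1, 3), (4, 1))

# The same sum written out with explicit loops: the single row of x is reused
# for every row of y, and the single column of y for every column of x.
x_row = [1, 2, 3]
y_col = [1, 2, 3, 4]
table = [[x + y for x in x_row] for y in y_col]

print(shape)
print(table)
```

The nested-loop result has the same values as the `x + y` output shown above, which is what "copy the elements, then operate element-wise" amounts to.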

0
0
0
Views: 2034
清晨我上码

PyTorch in 10 Days - Day 03 - Data Loading

PyTorch Data Loading

- GPU configuration
- Data preprocessing
- Splitting into training, validation, and test sets
- Choosing a model
- Setting the loss function and optimization method
- Evaluating model performance

This lesson mainly covers the first three parts.

```python
# Import commonly used packages
import os
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision.transforms import transforms
```

Hyperparameters can be set in one place. The parameters to initialize are:

- batch size
- initial learning rate
- number of training epochs (max_epochs)
- GPU configuration

```python
# Hyperparameters
# Batch size
batch_size = 16  # 32, 64, or 128 are also common choices
# Learning rate for the optimizer
lr = 1e-4
# Number of epochs to run
max_epochs = 10

# Option 1: select GPUs through an environment variable
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # expose GPUs 0 and 1
# Option 2: use a "device" and call .to(device) on everything that should live on the GPU
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")  # use GPU 1
```

A Dataset class mainly consists of three methods:

- __init__: receives external arguments and defines the sample set
- __getitem__: reads the elements of the sample set one at a time, optionally applies transformations, and returns the data needed for training or validation
- __len__: returns the number of samples in the dataset

```python
# Data loading
# CIFAR10 is used as an example of building a Dataset
from torchvision import datasets

# "data_transform" applies transformations to the images, such as flipping,
# cropping, or normalization; define it to suit your task.
data_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# download=False assumes the dataset is already present under ./cifar10
train_cifar_dataset = datasets.CIFAR10('cifar10', train=True, download=False, transform=data_transform)
test_cifar_dataset = datasets.CIFAR10('cifar10', train=False, download=False, transform=data_transform)
```

```python
# Inspect the dataset
# Note: __len__ is not called here, so this prints the bound method together
# with the dataset description; use len(test_cifar_dataset) to get the size.
print(test_cifar_dataset.__len__)
image_demo = test_cifar_dataset.__getitem__(1)[0]
print(image_demo)
print(image_demo.size())
```

```
<bound method CIFAR10.__len__ of Dataset CIFAR10
    Number of datapoints: 10000
    Root location: cifar10
    Split: Test
    StandardTransform
Transform: Compose(
               ToTensor()
               Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
           )>
tensor([[[ 0.8431,  0.8118,  0.8196,  ...,  0.8275,  0.8275,  0.8196],
         [ 0.8667,  0.8431,  0.8431,  ...,  0.8510,  0.8510,  0.8431],
         [ 0.8588,  0.8353,  0.8353,  ...,  0.8431,  0.8431,  0.8353],
```

```python
# Look at some samples
import matplotlib.pyplot as plt

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

dataiter = iter(test_cifar_dataset)
for i in range(10):
    images, labels = next(dataiter)
    print(images.size())
    print(str(classes[labels]))
    # images = images.numpy().transpose(1, 2, 0)  # move the channel dimension to the end
    # plt.title(str(classes[labels]))
    # plt.imshow(images)
    # plt.show()
```

```
torch.Size([3, 32, 32])
cat
torch.Size([3, 32, 32])
ship
torch.Size([3, 32, 32])
ship
torch.Size([3, 32, 32])
plane
torch.Size([3, 32, 32])
frog
torch.Size([3, 32, 32])
frog
torch.Size([3, 32, 32])
car
torch.Size([3, 32, 32])
frog
```

```python
# Once the Dataset is built, DataLoader reads the data in batches
train_loader = torch.utils.data.DataLoader(train_cifar_dataset,
                                           batch_size=batch_size, num_workers=4,
                                           shuffle=True, drop_last=True)

val_loader = torch.utils.data.DataLoader(test_cifar_dataset,
                                         batch_size=batch_size, num_workers=4,
                                         shuffle=False)
```

Parameter notes:

- batch_size: samples are read in batches, and batch_size is the number of samples per batch
- num_workers: the number of worker processes used to load data. Set it to 0 on Windows; on Linux, 4 or 8 is common, depending on your machine
- shuffle: whether to shuffle the data as it is read; usually True for the training set and False for the validation set
- drop_last: drops the final group of samples when it is smaller than a full batch, so that it does not take part in training

```python
# Custom Dataset class
import pandas as pd
from PIL import Image


class MyDataset(Dataset):
    def __init__(self, data_dir, info_csv, image_list, transform=None):
        """
        Args:
            data_dir: path to image directory.
            info_csv: path to the csv file containing image indexes
                with corresponding labels.
            image_list: path to the txt file containing the image names
                of the training/validation set
            transform: optional transform to be applied on a sample.
        """
        label_info = pd.read_csv(info_csv)
        with open(image_list) as f:
            image_file = f.readlines()
        self.data_dir = data_dir
        self.image_file = image_file
        self.label_info = label_info
        self.transform = transform

    def __getitem__(self, index):
        """
        Args:
            index: the index of item
        Returns:
            image and its labels
        """
        image_name = self.image_file[index].strip('\n')
        raw_label = self.label_info.loc[self.label_info['Image_index'] == image_name]
        label = raw_label.iloc[:, 0]
        image_name = os.path.join(self.data_dir, image_name)
        image = Image.open(image_name).convert('RGB')
        if self.transform is not None:
            image = self.transform(image)
        return image, label

    def __len__(self):
        return len(self.image_file)
```

```python
# Custom dataset demo
# The paths are left empty here; fill them in with real locations before running.
data_dir = ''
info_csv = ''
image_list = ''
my_dataset = MyDataset(data_dir, info_csv, image_list)
```
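To illustrate what batch_size and drop_last mean for the number of batches per epoch, here is a minimal sketch in plain Python. It imitates sequential batching only (no shuffling, no worker processes), and the helper names `num_batches` and `batch_indices` are hypothetical, not DataLoader APIs. The 50,000 figure is the size of the CIFAR-10 training split.

```python
import math


def num_batches(num_samples, batch_size, drop_last):
    """Number of batches per epoch for a given dataset size."""
    if drop_last:
        return num_samples // batch_size          # incomplete final batch is discarded
    return math.ceil(num_samples / batch_size)    # incomplete final batch is kept


def batch_indices(num_samples, batch_size, drop_last):
    """Yield lists of sample indices in order, one list per batch."""
    for start in range(0, num_samples, batch_size):
        batch = list(range(start, min(start + batch_size, num_samples)))
        if drop_last and len(batch) < batch_size:
            return
        yield batch


# A toy dataset of 10 samples with batch_size=4
kept = [len(b) for b in batch_indices(10, 4, drop_last=False)]
dropped = [len(b) for b in batch_indices(10, 4, drop_last=True)]
print(kept, dropped)

# CIFAR-10 training split (50,000 images)
print(num_batches(50000, 16, drop_last=True))   # 16 divides 50,000 evenly
print(num_batches(50000, 64, drop_last=True))
print(num_batches(50000, 64, drop_last=False))
```

With batch_size = 16 as in this lesson, drop_last changes nothing for CIFAR-10 because 50,000 is a multiple of 16; it starts to matter with sizes such as 64, where the last batch would hold only 16 samples.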

0
0
0
Views: 2036