阿里云天池大赛赛题(机器学习)——阿里云安全恶意程序检测(完整代码)

这篇具有很好参考价值的文章主要介绍了阿里云天池大赛赛题(机器学习)——阿里云安全恶意程序检测(完整代码)。希望对大家有所帮助。如果存在错误或未考虑完全的地方,请大家不吝赐教,您也可以点击"举报违法"按钮提交疑问。

赛题背景

阿里云作为国内最大的云服务提供商,每天都面临着网络上海量的恶意攻击。
本题目提供的一堆恶意文件数据,包括感染性病毒、木马程序、挖矿程序、DDoS木马、勒索病毒等等,总计6亿条数据,每个文件数据会有对API调用顺序及线程等相关信息,我们需要训练模型,将测试文件正确归类(预测出是哪种病毒),因此是典型的多分类问题
常见的分类算法:朴素贝叶斯决策树支持向量机KNN逻辑回归等等;
集成学习:随机森林GBDT(梯度提升决策树),AdabootXGBoostLightGBMCatBoost等等;
神经网络:MLP(多层神经网络),DL(深度学习)等。

全代码(ML 和 DL)

一个典型的机器学习实战算法基本包括 1) 数据处理,2) 特征选取、优化,和 3) 模型选取、验证、优化。 因为 “数据和特征决定了机器学习的上限,而模型和算法知识逼近这个上限而已。” 所以在解决一个机器学习问题时大部分时间都会花在数据处理和特征优化上。
大家最好在jupyter notebook上一段一段地跑下面的代码,加深理解。
机器学习的基本知识可以康康我的其他文章哦 好康的。

特征工程进阶与方案优化 代码

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

from tqdm import tqdm_notebook

import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
# 内存管理
import numpy as np
import pandas as pd
from tqdm import tqdm  

class _Data_Preprocess:
    def __init__(self):
        self.int8_max = np.iinfo(np.int8).max
        self.int8_min = np.iinfo(np.int8).min

        self.int16_max = np.iinfo(np.int16).max
        self.int16_min = np.iinfo(np.int16).min

        self.int32_max = np.iinfo(np.int32).max
        self.int32_min = np.iinfo(np.int32).min

        self.int64_max = np.iinfo(np.int64).max
        self.int64_min = np.iinfo(np.int64).min

        self.float16_max = np.finfo(np.float16).max
        self.float16_min = np.finfo(np.float16).min

        self.float32_max = np.finfo(np.float32).max
        self.float32_min = np.finfo(np.float32).min

        self.float64_max = np.finfo(np.float64).max
        self.float64_min = np.finfo(np.float64).min

    def _get_type(self, min_val, max_val, types):
        if types == 'int':
            if max_val <= self.int8_max and min_val >= self.int8_min:
                return np.int8
            elif max_val <= self.int16_max <= max_val and min_val >= self.int16_min:
                return np.int16
            elif max_val <= self.int32_max and min_val >= self.int32_min:
                return np.int32
            return None

        elif types == 'float':
            if max_val <= self.float16_max and min_val >= self.float16_min:
                return np.float16
            if max_val <= self.float32_max and min_val >= self.float32_min:
                return np.float32
            if max_val <= self.float64_max and min_val >= self.float64_min:
                return np.float64
            return None

    def _memory_process(self, df):
        init_memory = df.memory_usage().sum() / 1024 ** 2 / 1024
        print('Original data occupies {} GB memory.'.format(init_memory))
        df_cols = df.columns

          
        for col in tqdm_notebook(df_cols):
            try:
                if 'float' in str(df[col].dtypes):
                    max_val = df[col].max()
                    min_val = df[col].min()
                    trans_types = self._get_type(min_val, max_val, 'float')
                    if trans_types is not None:
                        df[col] = df[col].astype(trans_types)
                elif 'int' in str(df[col].dtypes):
                    max_val = df[col].max()
                    min_val = df[col].min()
                    trans_types = self._get_type(min_val, max_val, 'int')
                    if trans_types is not None:
                        df[col] = df[col].astype(trans_types)
            except:
                print(' Can not do any process for column, {}.'.format(col)) 
        afterprocess_memory = df.memory_usage().sum() / 1024 ** 2 / 1024
        print('After processing, the data occupies {} GB memory.'.format(afterprocess_memory))
        return df
memory_process = _Data_Preprocess()
### 数据读取
path  = '../security_data/'
train = pd.read_csv(path + 'security_train.csv')
test  = pd.read_csv(path + 'security_test.csv')
train.head()
def simple_sts_features(df):
    simple_fea             = pd.DataFrame()
    simple_fea['file_id']  = df['file_id'].unique()
    simple_fea             = simple_fea.sort_values('file_id')
     
    df_grp = df.groupby('file_id')
    simple_fea['file_id_api_count']   = df_grp['api'].count().values
    simple_fea['file_id_api_nunique'] = df_grp['api'].nunique().values
    
    simple_fea['file_id_tid_count']   = df_grp['tid'].count().values
    simple_fea['file_id_tid_nunique'] = df_grp['tid'].nunique().values
    
    simple_fea['file_id_index_count']   = df_grp['index'].count().values
    simple_fea['file_id_index_nunique'] = df_grp['index'].nunique().values
    
    return simple_fea
%%time
simple_train_fea1 = simple_sts_features(train)
%%time
simple_test_fea1 = simple_sts_features(test)
def simple_numerical_sts_features(df):
    simple_numerical_fea             = pd.DataFrame()
    simple_numerical_fea['file_id']  = df['file_id'].unique()
    simple_numerical_fea             = simple_numerical_fea.sort_values('file_id')
     
    df_grp = df.groupby('file_id')
    
    simple_numerical_fea['file_id_tid_mean']  = df_grp['tid'].mean().values
    simple_numerical_fea['file_id_tid_min']   = df_grp['tid'].min().values
    simple_numerical_fea['file_id_tid_std']   = df_grp['tid'].std().values
    simple_numerical_fea['file_id_tid_max']   = df_grp['tid'].max().values
    
    
    simple_numerical_fea['file_id_index_mean']= df_grp['index'].mean().values
    simple_numerical_fea['file_id_index_min'] = df_grp['index'].min().values
    simple_numerical_fea['file_id_index_std'] = df_grp['index'].std().values
    simple_numerical_fea['file_id_index_max'] = df_grp['index'].max().values
    
    return simple_numerical_fea
%%time
simple_train_fea2 = simple_numerical_sts_features(train)
%%time
simple_test_fea2 = simple_numerical_sts_features(test)

特征工程进阶部分

def api_pivot_count_features(df):
    tmp = df.groupby(['file_id','api'])['tid'].count().to_frame('api_tid_count').reset_index()
    tmp_pivot = pd.pivot_table(data=tmp,index = 'file_id',columns='api',values='api_tid_count',fill_value=0)
    tmp_pivot.columns = [tmp_pivot.columns.names[0] + '_pivot_'+ str(col) for col in tmp_pivot.columns]
    tmp_pivot.reset_index(inplace = True)
    tmp_pivot = memory_process._memory_process(tmp_pivot)
    return tmp_pivot 
%%time
simple_train_fea3 = api_pivot_count_features(train)
%%time
simple_test_fea3 = api_pivot_count_features(test)
def api_pivot_nunique_features(df):
    tmp = df.groupby(['file_id','api'])['tid'].nunique().to_frame('api_tid_nunique').reset_index()
    tmp_pivot = pd.pivot_table(data=tmp,index = 'file_id',columns='api',values='api_tid_nunique',fill_value=0)
    tmp_pivot.columns = [tmp_pivot.columns.names[0] + '_pivot_'+ str(col) for col in tmp_pivot.columns]
    tmp_pivot.reset_index(inplace = True)
    tmp_pivot = memory_process._memory_process(tmp_pivot)
    return tmp_pivot 
%%time
simple_train_fea4 = api_pivot_count_features(train)
%%time
simple_test_fea4 = api_pivot_count_features(test)
train_label = train[['file_id','label']].drop_duplicates(subset = ['file_id','label'], keep = 'first')
test_submit = test[['file_id']].drop_duplicates(subset = ['file_id'], keep = 'first')
train_data = train_label.merge(simple_train_fea1, on ='file_id', how='left')
train_data = train_data.merge(simple_train_fea2, on ='file_id', how='left')
train_data = train_data.merge(simple_train_fea3, on ='file_id', how='left')
train_data = train_data.merge(simple_train_fea4, on ='file_id', how='left')
test_submit = test_submit.merge(simple_test_fea1, on ='file_id', how='left')
test_submit = test_submit.merge(simple_test_fea2, on ='file_id', how='left')
test_submit = test_submit.merge(simple_test_fea3, on ='file_id', how='left')
test_submit = test_submit.merge(simple_test_fea4, on ='file_id', how='left')
### 评估指标构建
def lgb_logloss(preds,data):
    labels_ = data.get_label()             
    classes_ = np.unique(labels_) 
    preds_prob = []
    for i in range(len(classes_)):
        preds_prob.append(preds[i*len(labels_):(i+1) * len(labels_)] )
        
    preds_prob_ = np.vstack(preds_prob) 
    
    loss = []
    for i in range(preds_prob_.shape[1]):     
        sum_ = 0
        for j in range(preds_prob_.shape[0]): 
            pred = preds_prob_[j,i]           
            if  j == labels_[i]:
                sum_ += np.log(pred)
            else:
                sum_ += np.log(1 - pred)
        loss.append(sum_)       
    return 'loss is: ',-1 * (np.sum(loss) / preds_prob_.shape[1]),False

基于LightGBM 的模型验证

train_features = [col for col in train_data.columns if col not in ['label','file_id']]
train_label    = 'label'
%%time
from sklearn.model_selection import StratifiedKFold,KFold
params = {
        'task':'train', 
        'num_leaves': 255,
        'objective': 'multiclass',
        'num_class': 8,
        'min_data_in_leaf': 50,
        'learning_rate': 0.05,
        'feature_fraction': 0.85,
        'bagging_fraction': 0.85,
        'bagging_freq': 5, 
        'max_bin':128,
        'random_state':100
    }   

folds = KFold(n_splits=5, shuffle=True, random_state=15)
oof = np.zeros(len(train))

predict_res = 0
models = []
for fold_, (trn_idx, val_idx) in enumerate(folds.split(train_data)):
    print("fold n°{}".format(fold_))
    trn_data = lgb.Dataset(train_data.iloc[trn_idx][train_features], label=train_data.iloc[trn_idx][train_label].values)
    val_data = lgb.Dataset(train_data.iloc[val_idx][train_features], label=train_data.iloc[val_idx][train_label].values) 
    
    clf = lgb.train(params, trn_data, num_boost_round=2000,valid_sets=[trn_data,val_data], verbose_eval=50, early_stopping_rounds=100, feval=lgb_logloss) 
    models.append(clf)
plt.figure(figsize=[10,8])
sns.heatmap(train_data.iloc[:10000, 1:21].corr())
### 特征重要性分析
feature_importance             = pd.DataFrame()
feature_importance['fea_name'] = train_features
feature_importance['fea_imp']  = clf.feature_importance()
feature_importance             = feature_importance.sort_values('fea_imp',ascending = False)
feature_importance.sort_values('fea_imp',ascending = False)
plt.figure(figsize=[20, 10,])
plt.figure(figsize=[20, 10,])
sns.barplot(x = feature_importance.iloc[:10]['fea_name'], y = feature_importance.iloc[:10]['fea_imp'])
plt.figure(figsize=[20, 10,])
sns.barplot(x = feature_importance['fea_name'], y = feature_importance['fea_imp'])

模型测试

pred_res = 0
flod = 5
for model in models:
    pred_res += model.predict(test_submit[train_features]) * 1.0 / flod

test_submit['prob0'] = 0
test_submit['prob1'] = 0
test_submit['prob2'] = 0
test_submit['prob3'] = 0
test_submit['prob4'] = 0
test_submit['prob5'] = 0
test_submit['prob6'] = 0
test_submit['prob7'] = 0

test_submit[['prob0','prob1','prob2','prob3','prob4','prob5','prob6','prob7']] = pred_res
test_submit[['file_id','prob0','prob1','prob2','prob3','prob4','prob5','prob6','prob7']].to_csv('baseline2.csv',index = None)

 

深度学习解决方案:TextCNN建模 代码

数据读取

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

import lightgbm as lgb
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

from tqdm import tqdm_notebook
from sklearn.preprocessing import LabelBinarizer,LabelEncoder

import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

path  = '../security_data/'
train = pd.read_csv(path + 'security_train.csv')
test  = pd.read_csv(path + 'security_test.csv')

import numpy as np
import pandas as pd
from tqdm import tqdm  

class _Data_Preprocess:
    def __init__(self):
        self.int8_max = np.iinfo(np.int8).max
        self.int8_min = np.iinfo(np.int8).min

        self.int16_max = np.iinfo(np.int16).max
        self.int16_min = np.iinfo(np.int16).min

        self.int32_max = np.iinfo(np.int32).max
        self.int32_min = np.iinfo(np.int32).min

        self.int64_max = np.iinfo(np.int64).max
        self.int64_min = np.iinfo(np.int64).min

        self.float16_max = np.finfo(np.float16).max
        self.float16_min = np.finfo(np.float16).min

        self.float32_max = np.finfo(np.float32).max
        self.float32_min = np.finfo(np.float32).min

        self.float64_max = np.finfo(np.float64).max
        self.float64_min = np.finfo(np.float64).min

    def _get_type(self, min_val, max_val, types):
        if types == 'int':
            if max_val <= self.int8_max and min_val >= self.int8_min:
                return np.int8
            elif max_val <= self.int16_max <= max_val and min_val >= self.int16_min:
                return np.int16
            elif max_val <= self.int32_max and min_val >= self.int32_min:
                return np.int32
            return None

        elif types == 'float':
            if max_val <= self.float16_max and min_val >= self.float16_min:
                return np.float16
            if max_val <= self.float32_max and min_val >= self.float32_min:
                return np.float32
            if max_val <= self.float64_max and min_val >= self.float64_min:
                return np.float64
            return None

    def _memory_process(self, df):
        init_memory = df.memory_usage().sum() / 1024 ** 2 / 1024
        print('Original data occupies {} GB memory.'.format(init_memory))
        df_cols = df.columns

          
        for col in tqdm_notebook(df_cols):
            try:
                if 'float' in str(df[col].dtypes):
                    max_val = df[col].max()
                    min_val = df[col].min()
                    trans_types = self._get_type(min_val, max_val, 'float')
                    if trans_types is not None:
                        df[col] = df[col].astype(trans_types)
                elif 'int' in str(df[col].dtypes):
                    max_val = df[col].max()
                    min_val = df[col].min()
                    trans_types = self._get_type(min_val, max_val, 'int')
                    if trans_types is not None:
                        df[col] = df[col].astype(trans_types)
            except:
                print(' Can not do any process for column, {}.'.format(col)) 
        afterprocess_memory = df.memory_usage().sum() / 1024 ** 2 / 1024
        print('After processing, the data occupies {} GB memory.'.format(afterprocess_memory))
        return df

memory_process = _Data_Preprocess()

train.head()

数据预处理

# (字符串转化为数字)
unique_api = train['api'].unique()

api2index = {item:(i+1) for i,item in enumerate(unique_api)}
index2api = {(i+1):item for i,item in enumerate(unique_api)}

train['api_idx'] = train['api'].map(api2index)
test['api_idx']  = test['api'].map(api2index)

# 获取每个文件对应的字符串序列
def get_sequence(df,period_idx):
    seq_list = []
    for _id,begin in enumerate(period_idx[:-1]):
        seq_list.append(df.iloc[begin:period_idx[_id+1]]['api_idx'].values)
    seq_list.append(df.iloc[period_idx[-1]:]['api_idx'].values)
    return seq_list

train_period_idx = train.file_id.drop_duplicates(keep='first').index.values
test_period_idx  = test.file_id.drop_duplicates(keep='first').index.values

train_df = train[['file_id','label']].drop_duplicates(keep='first')
test_df  = test[['file_id']].drop_duplicates(keep='first')

train_df['seq'] = get_sequence(train,train_period_idx)
test_df['seq']  = get_sequence(test,test_period_idx)

TextCNN网络结构

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.layers import Dense, Input, LSTM, Lambda, Embedding, Dropout, Activation,GRU,Bidirectional
from keras.layers import Conv1D,Conv2D,MaxPooling2D,GlobalAveragePooling1D,GlobalMaxPooling1D, MaxPooling1D, Flatten
from keras.layers import CuDNNGRU, CuDNNLSTM, SpatialDropout1D
from keras.layers.merge import concatenate, Concatenate, Average, Dot, Maximum, Multiply, Subtract, average
from keras.models import Model
from keras.optimizers import RMSprop,Adam
from keras.layers.normalization import BatchNormalization
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.optimizers import SGD
from keras import backend as K
from sklearn.decomposition import TruncatedSVD, NMF, LatentDirichletAllocation
from keras.layers import SpatialDropout1D
from keras.layers.wrappers import Bidirectional
def TextCNN(max_len,max_cnt,embed_size, num_filters,kernel_size,conv_action, mask_zero):
    
    _input = Input(shape=(max_len,), dtype='int32')
    _embed = Embedding(max_cnt, embed_size, input_length=max_len, mask_zero=mask_zero)(_input)
    _embed = SpatialDropout1D(0.15)(_embed)
    warppers = []
    
    for _kernel_size in kernel_size:
        conv1d = Conv1D(filters=num_filters, kernel_size=_kernel_size, activation=conv_action)(_embed)
        warppers.append(GlobalMaxPooling1D()(conv1d))
                        
    fc = concatenate(warppers)
    fc = Dropout(0.5)(fc)
    #fc = BatchNormalization()(fc)
    fc = Dense(256, activation='relu')(fc)
    fc = Dropout(0.25)(fc)
    #fc = BatchNormalization()(fc) 
    preds = Dense(8, activation = 'softmax')(fc)
    
    model = Model(inputs=_input, outputs=preds)
    
    model.compile(loss='categorical_crossentropy',
        optimizer='adam',
        metrics=['accuracy'])
    return model

train_labels = pd.get_dummies(train_df.label).values
train_seq    = pad_sequences(train_df.seq.values, maxlen = 6000)
test_seq     = pad_sequences(test_df.seq.values, maxlen = 6000)

TextCNN训练和预测

from sklearn.model_selection import StratifiedKFold,KFold 
skf = KFold(n_splits=5, shuffle=True)

max_len     = 6000
max_cnt     = 295
embed_size  = 256
num_filters = 64
kernel_size = [2,4,6,8,10,12,14]
conv_action = 'relu'
mask_zero   = False
TRAIN       = True

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
meta_train = np.zeros(shape = (len(train_seq),8))
meta_test = np.zeros(shape = (len(test_seq),8))
FLAG = True
i = 0
for tr_ind,te_ind in skf.split(train_labels):
    i +=1
    print('FOLD: '.format(i))
    print(len(te_ind),len(tr_ind)) 
    model_name = 'benchmark_textcnn_fold_'+str(i)
    X_train,X_train_label = train_seq[tr_ind],train_labels[tr_ind]
    X_val,X_val_label     = train_seq[te_ind],train_labels[te_ind]
    
    model = TextCNN(max_len,max_cnt,embed_size,num_filters,kernel_size,conv_action,mask_zero)
    
    model_save_path = './NN/%s_%s.hdf5'%(model_name,embed_size)
    early_stopping =EarlyStopping(monitor='val_loss', patience=3)
    model_checkpoint = ModelCheckpoint(model_save_path, save_best_only=True, save_weights_only=True)
    if TRAIN and FLAG:
        model.fit(X_train,X_train_label,validation_data=(X_val,X_val_label),epochs=100,batch_size=64,shuffle=True,callbacks=[early_stopping,model_checkpoint] )
    
    model.load_weights(model_save_path)
    pred_val = model.predict(X_val,batch_size=128,verbose=1)
    pred_test = model.predict(test_seq,batch_size=128,verbose=1)
    
    meta_train[te_ind] = pred_val
    meta_test += pred_test
    K.clear_session()
meta_test /= 5.0 

结果提交

test_df['prob0'] = 0
test_df['prob1'] = 0
test_df['prob2'] = 0
test_df['prob3'] = 0
test_df['prob4'] = 0
test_df['prob5'] = 0
test_df['prob6'] = 0
test_df['prob7'] = 0

test_df[['prob0','prob1','prob2','prob3','prob4','prob5','prob6','prob7']] = meta_test
test_df[['file_id','prob0','prob1','prob2','prob3','prob4','prob5','prob6','prob7']].to_csv('nn_baseline_5fold.csv',index = None)

以上内容和代码全部来自于《阿里云天池大赛赛题解析(机器学习篇)》这本好书,十分推荐大家去阅读原书!文章来源地址https://www.toymoban.com/news/detail-468075.html

到了这里,关于阿里云天池大赛赛题(机器学习)——阿里云安全恶意程序检测(完整代码)的文章就介绍完了。如果您还想了解更多内容,请在右上角搜索TOY模板网以前的文章或继续浏览下面的相关文章,希望大家以后多多支持TOY模板网!

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处: 如若内容造成侵权/违法违规/事实不符,请点击违法举报进行投诉反馈,一经查实,立即删除!

领支付宝红包 赞助服务器费用

相关文章

  • 获奖名单公示|荣耀时刻,「第5届天池全球数据库大赛」决赛圆满收官

    日前,由阿里云主办、阿里云瑶池数据库和天池平台承办的 “ 第五届天池全球数据库大赛 ” 圆满收官。   历经过去4个多月的层层选拔, 2大赛道20支队伍 从 7047支 参赛战队中脱颖而出,成功晋级大赛决赛圈。最终 , 来自蔚来汽车等企业组队的 「带对听花」队伍 和 来自北

    2024年02月04日
    浏览(37)
  • 机器学习和深度学习检测网络安全课题:DDOS检测、恶意软件、恶意流量检测课题资料

    开源的DDOS检测工具 https://github.com/equalitie/learn2ban 基于KDDCUP 99数据集预测DDoS攻击 基于谱分析与统计机器学习的DDoS攻击检测技术研究 基于机器学习的分布式拒绝服务攻击检测方法研究 DDoS Attacks Using Hidden Markov Models and Cooperative ReinforcementLearning* 恶意软件检测 https://github.com/dc

    2024年01月18日
    浏览(48)
  • 《天池精准医疗大赛-人工智能辅助糖尿病遗传风险预测》模型复现和数据挖掘-论文_企业

    进入21世纪,生命科学特别是基因科技已经广泛而且深刻影响到每个人的健康生活,于此同时,科学家们借助基因科技史无前例的用一种全新的视角解读生命和探究疾病本质。人工智能(AI)能够处理分析海量医疗健康数据,通过认知分析获取洞察,服务于政府、健康医疗机构

    2023年04月09日
    浏览(62)
  • 阿里云天池 天池实验室DSW探索者版 免费GPU 天池notebook教程

    1、DSW教程 点击天池notebook,进入我的实验室 选择一个私有项目,点击编辑 集成机器学习 PAI DSW (DataScienceWorkshop)探索者版开发环境 左边文件管理,中间工作区,右边是计算资源。 在文件资源管理区的顶部还有4个按钮,从左到右分别对应的是:打开DSW Launcher启动器,新建文

    2024年02月01日
    浏览(58)
  • 2017年江西省电子专题大赛赛题剖析

    【题目要求】 简易数控直直流稳压电源 一、内容 参赛学生从赛场提供的元器中选取必要的一部分,设计并制作一个简易数控直流稳压电源。用轻触按键设置输出直流电压值,电压值用共阴数码管显示。 二、要求 1、基本部分 1)、用数码管显示当前直流电压值。上电后初始

    2024年02月01日
    浏览(48)
  • 15 万奖金!开放原子开源大赛OpenAnolis 赛题@你报名

    8 月 29 日,2023 开源和信息消费大赛新闻发布会在北京召开,首届“开放原子开源大赛”正式启动报名。大赛由工业和信息化部、江苏省人民政府、湖南省人民政府共同主办,开源赛道拟由开放原子开源基金会、央视网、江苏省工业和信息化厅、无锡市人民政府、江苏软件产

    2024年02月05日
    浏览(48)
  • kaggle新赛:Bengali.AI 语音识别大赛赛题解析

    赛题名称: Bengali.AI Speech Recognition 赛题链接: https://www.kaggle.com/competitions/bengaliai-speech 竞赛主办方 Bengali.AI 致力于加速孟加拉语(当地称为孟加拉语)的语言技术研究。Bengali.AI 通过社区驱动的收集活动众包大规模数据集,并通过研究竞赛为其数据集提供众包解决方案。孟加

    2024年02月16日
    浏览(51)
  • [网络安全提高篇] 一二二.恶意样本分类之基于API序列和机器学习的恶意家族分类详解

    终于忙完初稿,开心地写一篇博客。 “网络安全提高班”新的100篇文章即将开启,包括Web渗透、内网渗透、靶场搭建、CVE复现、攻击溯源、实战及CTF总结,它将更加聚焦,更加深入,也是作者的慢慢成长史。换专业确实挺难的,Web渗透也是块硬骨头,但我也试试,看看自己未

    2024年02月12日
    浏览(61)
  • 全国职业技能大赛云计算--高职组赛题卷①(容器云)

    说明:本任务提供有4台服务器master、node1、node2和cicd-node,都安装了centos7.5操作系统,在/opt/centos目录下有CentOS-7-x86_64-DVD-1804系统光盘文件所有文件,在/opt/containerk8s目录下有本次容器云运维所需的所有文件。 某公司技术部产品开发上线周期长,客户的需求经常得不到及时响应

    2024年02月07日
    浏览(40)
  • 全国职业技能大赛云计算--高职组赛题卷②(容器云)

    说明:本任务提供有4台服务器master、node1、node2和cicd-node,都安装了centos7.5操作系统,在/opt/centos目录下有CentOS-7-x86_64-DVD-1804系统光盘文件所有文件,在/opt/containerk8s目录下有本次容器云运维所需的所有文件。 某公司技术部产品开发上线周期长,客户的需求经常得不到及时响应

    2024年02月07日
    浏览(44)

觉得文章有用就打赏一下文章作者

支付宝扫一扫打赏

博客赞助

微信扫一扫打赏

请作者喝杯咖啡吧~博客赞助

支付宝扫一扫领取红包,优惠每天领

二维码1

领取红包

二维码2

领红包