Cho's Study Diary

Data (25)

[Mutect2 vs Strelka] Notes on the differences in somatic mutation calls. In Progress: Mutect2 vs Strelka2 First Pass by cansavvy · Pull Request #38 · AlexsLemonade/OpenPBTA-analysis (github.com) Data/Bioinformatics tools 2022. 12. 6.

Graph resources. Visualize Similarities Between Companies With Graph Database | by Khuyen Tran | Medium (khuyentran1476.medium.com): Build and Analyze Graph Database with Neo4j. Graphs with Python | by Dmytro Nikolaiev (Dimid) | Towards Data Science: Graphs with Python: Overview and Best Libraries. Graph analysis, interactive visualizations, and graph machin.. Data/Data Manipulation 2022. 10. 20.

[Python] Missingno Package: Overview of new datasets. ResidentMario/missingno: Missing data visualization module for Python. (github.com) Data/Data Manipulation 2022. 10. 20.

[Pandas] Stratified sampling with pd.DataFrame.sample(). pandas.DataFrame.sample — pandas 1.5.0 documentation (pydata.org): Default ‘None’ results in equal probability weighting. If passed a Series, will align with target object on index. Index values in weights not found in sampled object will be ignored and index values in sampled object not in weights will be assigned we.. (pandas.pydata.org) In Pandas.. Data/Data Manipulation 2022. 10. 13.
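The excerpt above is cut off before it shows the actual method, so the following is only a minimal sketch of one common way to do stratified sampling in pandas, not necessarily the approach the post describes. It assumes pandas 1.1 or later, where `DataFrameGroupBy.sample` is available; the column names `label` and `value` and the toy data are made up for illustration.

```python
import pandas as pd

# Toy data (hypothetical): 8 rows of class "a" and 4 rows of class "b".
df = pd.DataFrame({
    "label": ["a"] * 8 + ["b"] * 4,
    "value": range(12),
})

# Sample the same fraction from every class so the class ratio is preserved.
# random_state makes the draw reproducible.
stratified = df.groupby("label").sample(frac=0.5, random_state=0)

# Count how many rows of each class ended up in the sample.
counts = stratified["label"].value_counts().to_dict()
print(counts)
```

Sampling a fixed fraction per group keeps the 2:1 ratio between the two classes, which a plain `df.sample(frac=0.5)` does not guarantee.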

[FASTA] Ways to split a FASTA file. GAAS/split_fasta.md at master · NBISweden/GAAS (github.com) Data/Bioinformatics tools 2022. 9. 28.

[Tool invision] BBMap. BBMap download | SourceForge.net Data/Bioinformatics tools 2022. 9. 28.

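[Matplotlib] How to save a PDF so that its text stays editable. When preparing figures for a paper, I often open pdf or eps files in Illustrator, so plots made in Python sometimes need to be saved in those formats as well. Saving as vector graphics lets you adjust the drawing freely, but the text can be converted to outlines too, in which case Illustrator no longer recognizes it as text. Setting the following matplotlib options keeps the text editable as text in Illustrator.
    import matplotlib.pyplot as plt
    plt.rcParams['pdf.fonttype'] = 42
    plt.rcParams['ps.fonttype'] = 42
Data/Plotting 2022. 9. 8.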
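As a rough end-to-end sketch of the setting above: font type 42 tells matplotlib to embed TrueType fonts instead of the default Type 3 fonts, which is why editors can still treat the glyphs as text. The plotted data, the title string, and the temporary output path are assumptions for illustration only.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Embed TrueType (Type 42) fonts in PDF and PostScript output.
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["ps.fonttype"] = 42

# A small placeholder figure (hypothetical data).
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title("Editable title")

# Write the figure to a temporary directory as a PDF.
out_path = os.path.join(tempfile.mkdtemp(), "figure.pdf")
fig.savefig(out_path)
plt.close(fig)
```

The two `rcParams` lines have to run before `savefig`, since the font type is read when the file is written.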

[6] Several approaches to stratified K-fold. While reading a paper, I came across a figure presenting K-fold schemes that look well suited to experimental data, so I am noting it here. The paper is Machine learning enables detection of early-stage colorectal cancer by whole-genome sequencing of plasma cell-free DNA | BMC Cancer | Full Text (biomedcentral.com). Background: Blood-based methods using c.. Data/Data Manipulation 2022. 9. 6.

[4] Trying out Python list comprehensions. Here is a handy way to flatten a "nested list" (a list that contains other lists) into a single list in Python. 1) List comprehension
    nested_list = [[1, 2], [3, 4], ["a", "b"]]
    new_list = [item for nested_list_sub in nested_list for item in nested_list_sub]
    print(new_list)
    # output [1, 2, 3, 4, 'a', 'b']
Each inner list (nested_list_sub) is taken from the outer list (nested_list) in turn, and each of its elements (item) is collected into a new list (new_list). List com.. Data/Data Manipulation 2022. 5. 29.
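To make the reading order of the comprehension concrete, the sketch below writes the same flattening three ways: the comprehension, the equivalent nested for-loops, and the standard library's `itertools.chain.from_iterable`. The last one is added here only as a comparison; the excerpt is cut off, so it may not be among the methods the post goes on to list.

```python
from itertools import chain

nested_list = [[1, 2], [3, 4], ["a", "b"]]

# 1) List comprehension: the for-clauses read left to right, outer loop first.
flat_comprehension = [item for sub in nested_list for item in sub]

# 2) The same logic written as explicit nested loops.
flat_loop = []
for sub in nested_list:
    for item in sub:
        flat_loop.append(item)

# 3) Standard-library alternative: chain the inner lists together.
flat_chain = list(chain.from_iterable(nested_list))

print(flat_comprehension)  # [1, 2, 3, 4, 'a', 'b']
```

All three flatten exactly one level of nesting; deeper nesting would need recursion or repeated application.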

[3] Pandas transform: usable on DataFrames in place of lambda, and more versatile. When to use Pandas transform() function | by B. Chen | Towards Data Science: Some of the most useful Pandas tricks (towardsdatascience.com) Data/Data Manipulation 2022. 5. 29.
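The entry above only links to an article, so here is a small sketch of one typical use of `transform`, not a summary of the linked post: broadcasting a per-group statistic back onto every row. The `team` and `score` columns and their values are hypothetical.

```python
import pandas as pd

# Hypothetical scores for two teams.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1.0, 3.0, 10.0, 20.0],
})

# transform returns a result aligned with the original rows,
# so the group mean can be assigned straight back as a new column.
df["team_mean"] = df.groupby("team")["score"].transform("mean")

# With the group mean on every row, centering within each team is one subtraction.
df["centered"] = df["score"] - df["team_mean"]
print(df)
```

Unlike an aggregation, which returns one row per group, `transform` keeps the original index, which is what makes the direct column assignment possible.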

[2] Pandas cut: an alternative to loc with conditional expressions. bins=[0, 12, 19, 61, 100] labels=[' Data/Data Manipulation 2022. 5. 29.
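The excerpt stops before the labels are shown, so the sketch below reuses the bins from the post but fills in label names that are purely hypothetical, along with a made-up `age` column. It compares `pd.cut` with the repeated `loc` conditions it can replace.

```python
import pandas as pd

df = pd.DataFrame({"age": [5, 12, 15, 30, 61, 80]})  # hypothetical ages

bins = [0, 12, 19, 61, 100]                    # bins as given in the post
labels = ["child", "teen", "adult", "senior"]  # hypothetical label names

# pd.cut assigns each value to a bin; by default bins are right-inclusive,
# i.e. (0, 12], (12, 19], (19, 61], (61, 100].
df["group"] = pd.cut(df["age"], bins=bins, labels=labels)

# The same binning written with one loc condition per bin, for comparison.
df["group_loc"] = None
for low, high, label in zip(bins[:-1], bins[1:], labels):
    df.loc[(df["age"] > low) & (df["age"] <= high), "group_loc"] = label

print(df)
```

With `pd.cut` the bin edges and labels live in two lists, so adding or moving a boundary means changing one number rather than rewriting a condition.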

[1] Pandas query: try it just once! Introduction. When doing EDA with Pandas,
    import seaborn as sns
    import matplotlib.pyplot as plt
    import os
    import numpy as np
    import pandas as pd
    iris = sns.load_dataset('iris')
to pull out only the rows of iris whose sepal_length is between 5 and 6, people usually write something like this: iris[(iris.sepal_length>5) & (iris.sepal_length Data/Data Manipulation 2022. 5. 27.
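The excerpt is truncated in the middle of the boolean-mask expression, so the sketch below assumes strict bounds on both sides (greater than 5 and less than 6) and compares that mask with the equivalent `DataFrame.query` call. To stay self-contained it uses a small hand-made frame, `iris_like`, instead of downloading the real iris dataset.

```python
import pandas as pd

# A tiny stand-in for the iris data (hypothetical values).
iris_like = pd.DataFrame({
    "sepal_length": [4.9, 5.1, 5.8, 6.0, 6.3],
    "species": ["setosa", "setosa", "versicolor", "versicolor", "virginica"],
})

# Boolean-mask style: the frame name is repeated for every condition.
by_mask = iris_like[(iris_like.sepal_length > 5) & (iris_like.sepal_length < 6)]

# query style: the condition is a string that refers to columns by name.
by_query = iris_like.query("sepal_length > 5 and sepal_length < 6")

print(by_query)
```

Both select the same rows; `query` mainly saves the repeated frame name and the extra parentheses around each condition.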
