How to identify outliers on a box and whisker plot that seems to be compressed?
Data Science Stack Exchange
by Susmitha Acharya
2d ago
I have plotted box plots for the features of an ML problem in order to identify outliers. I scaled the data with a MinMaxScaler, so the scaled values lie in the range [0, 1]. For some columns the first and third quartiles are clearly visible, but for other features the first and third quartiles are exactly the same. How can I identify outliers for such features? [![Box and whisker plot][1]][1] [1]: https://i.stack.imgur.com/hkMc8.jpg ..read more
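To make the situation concrete, here is a minimal sketch of the usual box-plot whisker rule (1.5 times the interquartile range) applied to a feature whose first and third quartiles coincide. The data and the helper name `iqr_outliers` are made up for illustration and are not taken from the question.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (q1, q3, outliers) using the box-plot whisker rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return q1, q3, outliers


# Hypothetical scaled feature: mostly zeros with a couple of larger values.
feature = [0.0] * 18 + [0.4, 1.0]
q1, q3, outliers = iqr_outliers(feature)
print(q1, q3, outliers)
```

When the quartiles are equal the interquartile range is zero, so this rule flags every value that differs from the quartile value. For such a feature the rule may simply be reporting that most values are identical, and a different criterion may be more informative.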
How does dependency information impact binary classification in multi-label prediction models?
Data Science Stack Exchange
by malisokan
2d ago
TL;DR: I don't understand the dependency issue with binary classification (binary relevance) compared to multi-label prediction models. I often read in papers that some kind of "dependency information" is missing when binary classification is used for a multi-label prediction problem, that is, when the multi-label problem is solved with separate binary classifiers. Let's take a simple data set with multiple target variables:

Age  Gender  TargetA  TargetB  TargetC
34   m       1        1        1
22   f       0        0        1
45   m       1        0        0

If the goal is to predict future TargetA, TargetB and TargetC, then I see these possibilities ..read more
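As a rough illustration of the two views on the example table, the sketch below builds the per-label targets that binary relevance would train on, next to the joint label combinations. The variable names are chosen for illustration only.

```python
from collections import Counter

# Rows from the example table: (Age, Gender, TargetA, TargetB, TargetC)
rows = [
    (34, "m", 1, 1, 1),
    (22, "f", 0, 0, 1),
    (45, "m", 1, 0, 0),
]
label_names = ["TargetA", "TargetB", "TargetC"]

# Binary relevance: one independent target vector per label.
per_label_targets = {
    name: [row[2 + i] for row in rows] for i, name in enumerate(label_names)
}

# Joint view: each row's label combination is treated as a single unit.
joint_targets = Counter(row[2:] for row in rows)

print(per_label_targets)
print(joint_targets)
```

Each per-label classifier sees only its own column. In this tiny table TargetB is 1 only when TargetA is 1, a relationship that stays visible in the joint combinations but that a separate TargetB classifier is never shown.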
Tensorflow SegNet architecture
Data Science Stack Exchange
by D .Stark
2d ago
I was unable to find a complete description of the SegNet architecture for image segmentation (specifically, the decoder layers). Therefore, I would like to clarify the correctness of my implementation (schematically):

Input(x, x, 3)
Conv2d(64)+BatchNormalization+ReLU
Conv2d(64)+BatchNormalization+ReLU
MaxPoolWithArgMax
Conv2d(128)+BatchNormalization+ReLU
Conv2d(128)+BatchNormalization+ReLU
MaxPoolWithArgMax
Conv2d(256)+BatchNormalization+ReLU
Conv2d(256)+BatchNormalization+ReLU
Conv2d(256)+BatchNormalization+ReLU
MaxPoolWithArgMax
Conv2d(512)+BatchNormalization+ReLU
Conv2d(512)+BatchNorm ..read more
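The listing is cut off before the decoder, so nothing here reconstructs it. As a small aid for reasoning about the MaxPoolWithArgMax steps, the following pure-Python sketch shows max pooling that records the position of each maximum, and an unpooling step that places values back at those positions. The function names are hypothetical, and the input height and width are assumed to be divisible by the pool size.

```python
def max_pool_with_argmax(x, size=2):
    """Max pooling over size x size windows, also returning each max's position."""
    pooled, indices = [], []
    for r in range(0, len(x), size):
        pooled_row, index_row = [], []
        for c in range(0, len(x[0]), size):
            window = [
                (x[r + dr][c + dc], (r + dr, c + dc))
                for dr in range(size)
                for dc in range(size)
            ]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [2, 6, 3, 1],
]
pooled, indices = max_pool_with_argmax(feature_map)
restored = max_unpool(pooled, indices, 4, 4)
print(pooled)
print(restored)
```

The unpooled map is sparse: only the recorded positions carry values, and a decoder would normally follow it with convolutions that fill in the rest.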
Trying to read items from invoice extractor bot with open ai langchain llm
Data Science Stack Exchange
by Mcore8x
5d ago
I am trying out this tutorial for invoice extraction using an OpenAI LLM, but the code seems to work well only for invoices with a single item. If an invoice has more than one item, it does not pick up the items from the second row onward. The source code can pick up the first item from multiple invoices, but how can I modify it to pick up all items from a multi-item invoice? Tutorial: https://www.analyticsvidhya.com/blog/2023/10/building-invoice-extraction-bot-using-langchain-and-llm/ Source code: https://mega.nz/file/9ERV1IZI#iMNm_bzFMnssaIv2rAprYD9qhYILLP6R4J7r7rOq ..read more
Saving ML models with pickle to be deployed using Flask
Data Science Stack Exchange
by Kehinde Olatunji
5d ago
I trained some ensemble ML models for prediction and need to save them with pickle so that I can deploy them using Flask. I have tried several methods and read several articles but could not work it out. When I tried to use Linear Regression in Flask, I got an error saying that LR is not defined. How do I save the entire set of ensemble models with pickle, and what is the right way to deploy them in Flask so that I can make predictions? Thank you for your help ..read more
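One possible pattern, sketched here under assumptions, is to bundle all fitted models into a single dictionary and pickle that dictionary to one file, which the serving code loads once at startup. In this sketch `statistics.NormalDist` objects stand in for trained estimators, and the file name `ensemble.pkl` is hypothetical. The Flask part is not shown.

```python
import os
import pickle
import tempfile
from statistics import NormalDist

# Stand-ins for fitted models (assumption: real estimators would go here).
models = {
    "model_a": NormalDist.from_samples([10.0, 20.0, 30.0]),
    "model_b": NormalDist.from_samples([1.0, 2.0, 3.0]),
}

# Save the whole dictionary to a single file.
path = os.path.join(tempfile.mkdtemp(), "ensemble.pkl")
with open(path, "wb") as f:
    pickle.dump(models, f)

# Later, for example when the web application starts, load it back once.
with open(path, "rb") as f:
    restored = pickle.load(f)

means = {name: model.mean for name, model in restored.items()}
print(means)
```

Unpickling only works if the classes of the stored objects can be imported in the loading process, so the serving environment needs the same libraries installed as the training environment.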
Multiclass matrix loss function in scikit-learn / xgboost / lightgbm
Data Science Stack Exchange
by Avi T
5d ago
I have data with 4 classes: $c_1, c_2, c_3, c_4$. I'd like to create a classifier which has different scaling for the loss function per class combination: $$ \begin{bmatrix} 0 & l \left( \hat{c}_{1}, {c}_{2} \right) & l \left( \hat{c}_{1}, {c}_{3} \right) & l \left( \hat{c}_{1}, {c}_{4} \right) \\ l \left( \hat{c}_{2}, {c}_{1} \right) & 0 & l \left( \hat{c}_{2}, {c}_{3} \right) & l \left( \hat{c}_{2}, {c}_{4} \right) \\ l \left( \hat{c}_{3}, {c}_{1} \right) & l \left( \hat{c}_{3}, {c}_{2} \right) & 0 & l \left( \hat{c}_{3}, {c}_{4} \right) \\ l \left( \hat{c ..read more
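As a small, hedged illustration of a per-combination cost, the sketch below evaluates predictions against a cost matrix indexed as cost[predicted][actual], matching the row and column layout of the matrix in the question. The numeric costs and the helper name `mean_cost` are invented for illustration. This is an evaluation metric, not a training loss for any of the named libraries.

```python
# Hypothetical cost matrix: cost[predicted][actual], zero on the diagonal.
cost = [
    [0.0, 1.0, 2.0, 5.0],
    [1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.0],
    [5.0, 2.0, 1.0, 0.0],
]


def mean_cost(predicted, actual, cost):
    """Average misclassification cost over all samples."""
    return sum(cost[p][a] for p, a in zip(predicted, actual)) / len(actual)


# Class indices 0..3 stand for c_1..c_4.
actual = [0, 1, 2, 3, 3]
predicted = [0, 2, 2, 0, 3]
result = mean_cost(predicted, actual, cost)
print(result)
```

A metric like this can be used for model selection or for early stopping. Turning it into a differentiable training objective is a separate step that depends on the library.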
How will weights learn in CNN for multi class classification?
Data Science Stack Exchange
by Jai
5d ago
How can the filter weights in a CNN learn meaningful features for multi-class classification if they keep changing as different images pass through the network during training? Say we are doing multi-class classification with a CNN that has 5 classes and 10 kernels/filters. Suppose my first image is a pen: we pass it through the model and the kernel weights are changed, right? Then we pass an image of a book and the weights are changed again, right? So if the filter weights keep changing, how will the network learn anything ..read more
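A toy analogy may help here. The sketch below is not a CNN: it is a one-feature logistic model trained with stochastic gradient descent, where samples from two classes alternate and every sample changes the same shared weight by a small step. The data and the learning rate are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Toy 1-D data: samples of the two classes alternate during training.
samples = [(-2.0, 0), (2.0, 1), (-1.0, 0), (1.0, 1)]


def mean_loss(w, b):
    """Average cross-entropy loss over all samples."""
    total = 0.0
    for x, y in samples:
        p = sigmoid(w * x + b)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(samples)


w, b, lr = 0.0, 0.0, 0.1
loss_before = mean_loss(w, b)
for epoch in range(200):
    for x, y in samples:
        p = sigmoid(w * x + b)
        grad = p - y          # gradient of the loss w.r.t. the pre-activation
        w -= lr * grad * x    # small step; the same weight serves both classes
        b -= lr * grad
loss_after = mean_loss(w, b)
print(loss_before, loss_after, w)
```

Each update is a small nudge rather than a replacement, so the weight drifts toward a value that lowers the loss on all samples together.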
Recreating results from Research Paper
Data Science Stack Exchange
by Panos_42
5d ago
I have been trying to recreate the results from this particular paper (Neural Collaborative Filtering). The dataset I use closely resembles this. I understand that I should split my data into training and test sets. The question I have is whether I should create the test.negative file myself or whether it is handled automatically by the negative sampling inside the code (which basically contains the negative feedback based on the absence of data). I would really appreciate your feedback! Thanks in advance. Here is the official implementation of this paper on GitHub ..read more
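For reference, if such a file did have to be built by hand, sampling negatives for evaluation could look roughly like the sketch below: for each user, draw items the user has not interacted with. The function name, the toy interactions, and the value of `n_negatives` are all assumptions for illustration and say nothing about what the official code does.

```python
import random


def sample_negatives(user_items, all_items, n_negatives, seed=0):
    """For each user, sample items that do not appear in the user's interactions."""
    rng = random.Random(seed)
    negatives = {}
    for user, seen in user_items.items():
        candidates = sorted(all_items - seen)
        negatives[user] = rng.sample(candidates, min(n_negatives, len(candidates)))
    return negatives


# Hypothetical interactions: user -> set of item ids the user interacted with.
interactions = {"u1": {1, 2, 3}, "u2": {2, 5}}
all_items = set(range(1, 11))
negatives = sample_negatives(interactions, all_items, n_negatives=4)
print(negatives)
```

Fixing the seed keeps the sampled negatives identical across runs, which matters when evaluation results are compared.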
Elastic Net alpha value using GLMNET 4.1-8
Data Science Stack Exchange
by user162172
5d ago
Is it a valid method to “brute force” the alpha value for an elastic net? What I mean is trying alpha = .1, .2, .3, .4 and so on up to 1.0, looking at the highest R-squared value for each, and choosing the corresponding alpha for the model going forward. If so, is R-squared the best metric for determining the best alpha ..read more
Improve my f1_score for classification - pandas/sklearn
Data Science Stack Exchange
by user162343
5d ago
I would like advice on how to improve my f1_score for classification. I currently have something around 0.57.

Dataset:
lotWaferDie - lot, board and chip on which defects were measured (string values like W02-D12_11, ...)
XRel - relative position of the defect on the X axis
YRel - relative position of the defect on the Y axis
XSize - size of the defect along the X axis
YSize - size of the defect along the Y axis
DefArea - defect area
DefSize - size of the defect
dieRow - row of the defect on the board
dieCol - column of the defect on the board
xidx - index of the defect line on the board
yidx ..read more
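One lever that is sometimes tried, sketched here under the assumption of a binary target and a model that outputs probability-like scores, is tuning the decision threshold to maximise F1 on a validation set. The labels, scores and candidate thresholds below are invented and do not come from the dataset described above.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, scores, thresholds):
    """Return the candidate threshold with the highest F1."""
    return max(
        thresholds,
        key=lambda t: f1_score(y_true, [int(s >= t) for s in scores]),
    )


# Hypothetical validation labels and model scores.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.35, 0.4, 0.45, 0.55, 0.6, 0.3, 0.9]
thresholds = [0.3, 0.4, 0.5, 0.6]

chosen = best_threshold(y_true, scores, thresholds)
chosen_f1 = f1_score(y_true, [int(s >= chosen) for s in scores])
print(chosen, chosen_f1)
```

The threshold should be chosen on data held out from training and then confirmed on a separate test set. Otherwise the reported F1 will be optimistic.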