87 episodes

Learn the latest data science updates in the tech world.

Data Science Tech Brief By HackerNoon HackerNoon

    • News


    Outlier Detection: What You Need to Know

    This story was originally published on HackerNoon at: https://hackernoon.com/outlier-detection-what-you-need-to-know.
    Decisions are usually based on the sample mean, which is very sensitive to outliers: a few extreme values can change it dramatically. So it is crucial to manage outliers.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #outlier-detection, #statistics, #python3, #variance-reducing, #what-is-outlier-detection, #bootstrap, #problem-formulation, #data-analysis, and more.


    This story was written by: @nataliaogneva. Learn more about this writer by checking @nataliaogneva's about page,
    and for more stories, please visit hackernoon.com.



    Analysts often encounter outliers in data during their work. Decisions are usually based on the sample mean, which is very sensitive to outliers. It is crucial to manage outliers to make the correct decision. Let's consider several simple and fast approaches for working with unusual values.
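As a rough illustration of why the sample mean is sensitive to outliers, here is a minimal Python sketch. The summary above does not say which approaches the article uses, so the interquartile-range (Tukey fence) rule shown here is an assumed stand-in, and the sample data, the function name `iqr_outliers`, and the multiplier `k=1.5` are all hypothetical.

```python
from statistics import mean, median, quantiles


def iqr_outliers(values, k=1.5):
    """Split values into (kept, flagged) using Tukey's IQR fences.

    A value is flagged when it lies more than k * IQR below the first
    quartile or above the third quartile. k=1.5 is a common convention,
    not something taken from the article.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return kept, flagged


# Hypothetical sample: eight typical values and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 95]
kept, flagged = iqr_outliers(data)

print("mean with outlier:   ", round(mean(data), 3))
print("median with outlier: ", median(data))
print("flagged values:      ", flagged)
print("mean without outlier:", mean(kept))
```

In this sample the single extreme value nearly doubles the mean while the median barely moves, which is the kind of distortion the episode warns about. Whether to drop, cap, or keep flagged values is a separate decision that depends on the data.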

    • 2 min
    Instrument Variables and AB Testing – Part 1

    This story was originally published on HackerNoon at: https://hackernoon.com/instrument-variables-and-ab-testing-part-1.
    This article explores the mathematical details of the least squares estimator in unbiased settings and in settings biased by model specification errors.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #linear-regression, #ab-testing, #casual-analysis, #bias, #multicollinearity, #confounding, #instrument-variables, #instrument-variables-math, and more.


    This story was written by: @varunnakra1. Learn more about this writer by checking @varunnakra1's about page,
    and for more stories, please visit hackernoon.com.




    • 1 min
    Using Arrow Flight SQL Protocol in Apache Doris 2.1 For Super Fast Data Transfer

    This story was originally published on HackerNoon at: https://hackernoon.com/using-arrow-flight-sql-protocol-in-apache-doris-21-for-super-fast-data-transfer.
    Apache Doris 2.1 just got a major speed boost with Arrow Flight SQL for up to 10x faster data transfers.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #data-science, #database, #data-engineering, #pandas, #python, #apache-arrow, #big-data, #data, and more.


    This story was written by: @frankzzz. Learn more about this writer by checking @frankzzz's about page,
    and for more stories, please visit hackernoon.com.



    Apache Doris 2.1 supports the Arrow Flight SQL protocol for reading data from Doris. It delivers speedups of tens of times compared to PyMySQL and Pandas.

    • 8 min
    Data Science for Portfolio Optimization: Markowitz Mean-Variance Theory

    This story was originally published on HackerNoon at: https://hackernoon.com/data-science-for-portfolio-optimization-markowitz-mean-variance-theory.
    The theory formulates a mathematical model that optimizes asset allocations to gain the maximum return for a given risk level.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #data-science, #asset-management, #modern-portfolio-theory, #portfolio-optimization, #markowtiz-mean-variance, #what-is-the-markowitz-theory, #portfolio-theory, #investment-portfolio-tips, and more.


    This story was written by: @kustarev. Learn more about this writer by checking @kustarev's about page,
    and for more stories, please visit hackernoon.com.



    An investment portfolio comprises various assets such as stocks and bonds. Every investor starts with a fixed investment capital and decides how much to invest in each asset. Data science techniques such as the Markowitz mean-variance theory help determine the optimal share allocation to build the optimal portfolio.

    The theory formulates a mathematical model that optimizes asset allocations to gain the maximum return for a given risk level. It analyzes different financial assets and considers their rate of return and risk, based on their historical trends. The rate of return is an approximation of how much profit the asset will generate over a given time period. The risk is quantified using the standard deviation of the asset value. A higher deviation indicates a more volatile asset and, hence, higher risk.

    The return and risk values are calculated for various portfolio combinations and are represented on the efficient frontier curve. The curve helps investors determine the highest returns against their selected risk.
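The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two return series are made-up numbers, the helper names (`portfolio_stats`, `highest_return_within_risk`) are hypothetical, and the coarse grid search over two assets stands in for the proper optimization a real mean-variance model would use.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance of two equally long return series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def portfolio_stats(weights, asset_returns):
    """Return (expected return, risk) for the given portfolio weights.

    Expected return is the weighted mean of the asset returns; risk is the
    standard deviation of the portfolio, sqrt(sum_i sum_j w_i w_j cov_ij).
    """
    exp_return = sum(w * mean(r) for w, r in zip(weights, asset_returns))
    variance = sum(
        wi * wj * covariance(ri, rj)
        for wi, ri in zip(weights, asset_returns)
        for wj, rj in zip(weights, asset_returns)
    )
    return exp_return, sqrt(max(variance, 0.0))


def highest_return_within_risk(asset_returns, max_risk, steps=10):
    """Grid-search two-asset weights; keep the best return with risk <= max_risk."""
    best = None
    for i in range(steps + 1):
        w = i / steps
        weights = [w, 1 - w]
        ret, risk = portfolio_stats(weights, asset_returns)
        if risk <= max_risk and (best is None or ret > best[1]):
            best = (weights, ret, risk)
    return best


# Hypothetical per-period returns for two assets (not real market data).
asset_a = [0.02, 0.04, 0.00, 0.06]  # higher average return, more volatile
asset_b = [0.01, 0.01, 0.03, 0.03]  # lower average return, less volatile

ret, risk = portfolio_stats([0.5, 0.5], [asset_a, asset_b])
print("50/50 portfolio: return", round(ret, 4), "risk", round(risk, 4))

best = highest_return_within_risk([asset_a, asset_b], max_risk=0.015)
print("best weights within risk 0.015:", best[0])
```

Evaluating many weight combinations this way and plotting return against risk gives the points from which the efficient frontier is drawn.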

    • 5 min
    10 Best Datasets for Time Series Analysis

    This story was originally published on HackerNoon at: https://hackernoon.com/10-best-datasets-for-time-series-analysis.
    In order to understand how a certain metric varies over time and to predict future values, we will look at the 10 Best Datasets for Time Series Analysis.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #datasets, #ai, #data-science, #data-analysis, #data-analytics, #data-visualization, #hackernoon-datasets, #machine-learning, and more.


    This story was written by: @datasets. Learn more about this writer by checking @datasets's about page,
    and for more stories, please visit hackernoon.com.



    Time series data is essentially a collection of data points ordered in time. Time is frequently the independent variable, and the goal is usually to forecast future values. In this article, we will look at the *10 Best Datasets for Time Series Analysis* in order to understand how a certain metric varies over time.

    • 8 min
    Understanding Scaling Law through Data Science Lenses

    This story was originally published on HackerNoon at: https://hackernoon.com/understanding-scaling-law-through-data-science-lenses.
    Despite the immense promise of LMs, initial endeavors to apply pre-trained LMs to downstream tasks have encountered significant challenges.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science.
    You can also check exclusive content about #data-science, #scaling-law, #artificial-intelligence, #tokenizing-raw-text, #discrete-tokens, #embedding-vector, #token-embeddings, #power-laws, and more.


    This story was written by: @tianchengxu. Learn more about this writer by checking @tianchengxu's about page,
    and for more stories, please visit hackernoon.com.



    Despite the immense promise of LMs as task-neutral foundation models, initial endeavors to apply pre-trained LMs to downstream tasks encountered significant challenges.

    • 14 min
