A Detailed Look at Twitter's Login Flow

2024-02-09

#Twitter
#水

As is well known, Twitter's official clients use three common authentication methods: guest_token, cookie, and OAuth token

Guest token

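Before logging in, every API request has to be authenticated with a guest_token, apart from a few endpoints that need no authentication (for example, some endpoints used by Embedded Components). Getting a guest_token is simple: send a request with the Bearer token as the Authorization header. See my earlier article for details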

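Some widely known facts:

  • A guest_token is bound to the Bearer token that issued it, so the bearer token cannot be swapped freely
  • Each IP can obtain at most 2000 guest_tokens every 30 minutes; this quota does not depend on the bearer token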
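To make the two facts above concrete, here is a minimal sketch of a local bookkeeping helper. It only illustrates the idea: `GuestTokenBudget` and its method names are hypothetical, the 2000-per-30-minutes figure is taken from the list above, and the counter only sees activations made by this process, not everything sent from the same IP.

```javascript
// Hypothetical helper: counts guest_token activations made by this process
// inside a sliding 30-minute window and remembers which bearer token
// each guest_token belongs to.
class GuestTokenBudget {
    constructor(limit = 2000, windowMs = 30 * 60 * 1000, now = () => Date.now()) {
        this.limit = limit
        this.windowMs = windowMs
        this.now = now
        this.timestamps = []   // activation times still inside the window
        this.cache = new Map() // bearer token -> guest_token (they are bound together)
    }
    // Drop activations that have aged out of the window and report what is left
    remaining() {
        const cutoff = this.now() - this.windowMs
        this.timestamps = this.timestamps.filter((t) => t > cutoff)
        return this.limit - this.timestamps.length
    }
    // Record a freshly activated guest_token for the bearer token that produced it
    record(bearerToken, guestToken) {
        if (this.remaining() <= 0) {
            throw new Error('guest_token budget exhausted for this window')
        }
        this.timestamps.push(this.now())
        this.cache.set(bearerToken, guestToken)
    }
    // A guest_token is only looked up under the bearer token it was issued for
    get(bearerToken) {
        return this.cache.get(bearerToken)
    }
}
```

Calling `record()` after each successful activation and `remaining()` before the next one gives a rough idea of how close a single process is to the quota.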

The web version of Twitter needs only two cookies as credentials

  • auth_token
  • ct0

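The web client's Bearer token is shown below. As a side note, my testing shows that this token no longer has any permission to read timelines without logging in (individually specified tweets with no sensitive-content flag are the exception). So if you spot this token ending in TnA in a project's source code, the project either requires login or no longer works: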

Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA

I'll skip fetching the guest_token and get straight to the point: a look at every subtask you may encounter

Request helper

I wrapped the request to the relevant endpoint in a helper function; it works for both the cookie and OAuth flows

const sendLoginRequest = async (bearer_token, guest_token, cookie = {}, headers = {}, query = new URLSearchParams({}), body = {}) => fetch(`https://api.twitter.com/1.1/onboarding/task.json${query.size > 0 ? `?${query.toString()}` : ''}`, {
    method: "POST",
    headers: {
        'content-type': 'application/json',
        authorization: bearer_token,
        'x-guest-token': guest_token,
        cookie: Object.entries(cookie).map(([key, value]) => `${key}=${value}`).join('; '),
        ...headers
    },
    body: JSON.stringify(body)
}).then(async response => ({
    message: '',
    cookies: Object.fromEntries([...response.headers.entries()].filter((header) => header[0] === 'set-cookie').map((header) => {
        const tmpCookie = header[1].split(';')[0]
        const firstEqual = tmpCookie.indexOf('=')
        return [tmpCookie.slice(0, firstEqual), tmpCookie.slice(firstEqual + 1)]
    })),
    content: await response.json()
})).then(res => {
    //console.log(res)
    return res
}).catch(error => {
    //console.error(error)
    return {
        message: error.message,
        cookies: {},
        content: {}
    }
})
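The cookie handling inside the helper above can be looked at in isolation. The sketch below repeats the same split-on-the-first-`=` logic as two standalone functions; the names `parseSetCookies` and `toCookieHeader` and the sample header values are made up for illustration. On runtimes that implement it, `response.headers.getSetCookie()` returns an array of raw values that could be passed in directly.

```javascript
// Turn raw Set-Cookie header values into a { name: value } object,
// keeping only the "name=value" part that comes before the first ';'
const parseSetCookies = (setCookieValues) => Object.fromEntries(setCookieValues.map((line) => {
    const pair = line.split(';')[0]
    // Split on the first '=' only, because cookie values may contain '=' themselves
    const firstEqual = pair.indexOf('=')
    return [pair.slice(0, firstEqual).trim(), pair.slice(firstEqual + 1)]
}))

// Serialize a cookie object back into a Cookie request header value
const toCookieHeader = (cookieObject) => Object.entries(cookieObject).map(([key, value]) => `${key}=${value}`).join('; ')

// Made-up sample values, not real credentials
const parsedSample = parseSetCookies([
    'ct0=abc=def; Max-Age=21600; Path=/; Secure',
    'auth_token=0123; HttpOnly'
])
console.log(parsedSample)                 // { ct0: 'abc=def', auth_token: '0123' }
console.log(toCookieHeader(parsedSample)) // ct0=abc=def; auth_token=0123
```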

login

const bearer_token = 'Bearer AAAAAAAAAAAAAAAAAAAAANRILgAAAAAAnNwIzUejRCOuH5E6I8xnZz4puTs%3D1Zv7ttfk8LF81IUq16cHjhLTvJu4FA33AGWWjCpTnA'

const guest_token = (await (await fetch('https://api.twitter.com/1.1/guest/activate.json', {
    method: "POST",
    headers: {
        authorization: bearer_token
    }
})).json()).guest_token

let cookie = {}
let headers = {}

const login = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({
    flow_name: 'login'
}), {
    input_flow_data: { flow_context: { debug_overrides: {}, start_location: { location: 'unknown' } } },
    subtask_versions: {
        action_list: 2, alert_dialog: 1, app_download_cta: 1, check_logged_in_account: 1, choice_selection: 3, contacts_live_sync_permission_prompt: 0, cta: 7, email_verification: 2, end_flow: 1, enter_date: 1, enter_email: 2, enter_password: 5, enter_phone: 2, enter_recaptcha: 1, enter_text: 5, enter_username: 2, generic_urt: 3, in_app_notification: 1, interest_picker: 3, js_instrumentation: 1, menu_dialog: 1, notifications_permission_prompt: 2, open_account: 2, open_home_timeline: 1, open_link: 1, phone_verification: 4, privacy_options: 1, security_key: 3, select_avatar: 4, select_banner: 2, settings_list: 7, show_code: 1, sign_up: 2, sign_up_review: 4, tweet_selection_urt: 1, update_users: 1, upload_media: 1, user_recommendations_list: 4, user_recommendations_urt: 1, wait_spinner: 3, web_modal: 1
    }
})

This step yields the flow_token that the next request submits

*LoginJsInstrumentationSubtask

This step returns a large chunk of JS code; running it produces a JSON value to submit. In practice, submitting an empty object works too...

// const JsInstCookie = fetch('https://twitter.com/i/js_inst?c_name=ui_metrics', {
//     headers: {
//         cookie: Object.entries(cookie).map(([key, value]) => `${key}=${value}`).join('; ')
//     }
// }).then(response => Object.fromEntries([...response.headers.entries()].filter((header) => header[0] === 'set-cookie').map((header) => {
//     const tmpCookie = header[1].split(';')[0]
//     const firstEqual = tmpCookie.indexOf('=')
//     return [tmpCookie.slice(0, firstEqual), tmpCookie.slice(firstEqual + 1)]
// }))).catch(error => ({}))
// 
// cookie = { ...cookie, ...await JsInstCookie }

const LoginJsInstrumentationSubtask = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        js_instrumentation: {
            link: 'next_link',
            response: '{}'//<- if you want to submit the real value, please go ahead
        },
        subtask_id: 'LoginJsInstrumentationSubtask'
    }]
})

You then get:

  • * The cookie _twitter_sess. This cookie is not required, so the commented-out code above can indeed be skipped
  • flow_token
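Every step hands back cookies and a new flow_token that the following step needs, so it can help to keep them in one state object. The sketch below is one possible way to do that. It assumes the response body exposes `flow_token` and a `subtasks` array whose items carry a `subtask_id`; that shape is not shown in this article, and `advanceFlow` is a hypothetical name.

```javascript
// Merge one sendLoginRequest-style result ({ message, cookies, content })
// into the running login state.
// ASSUMPTION: content.flow_token and content.subtasks[].subtask_id exist
// in successful responses; this shape is not documented in the article.
const advanceFlow = (state, result) => ({
    // Newer cookies overwrite older ones with the same name
    cookie: { ...state.cookie, ...result.cookies },
    // Keep the previous flow_token when the response does not carry a new one
    flow_token: result.content?.flow_token ?? state.flow_token,
    // Subtask ids the server expects next (empty when none are listed)
    pending: (result.content?.subtasks ?? []).map((subtask) => subtask.subtask_id)
})
```

Starting from `{ cookie: {}, flow_token: undefined, pending: [] }` and calling `advanceFlow` after each request keeps the latest values in one place.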

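If you want to compute the real js_instrumentation.response value, here is some code for reference only. It parses the script into an AST, then extracts and runs the part that is needed. Because the script also performs some DOM operations that serve no real purpose, I had to write a class to mock them...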

// yarn add acorn
import { parse } from 'acorn'
const JsInstContent = '...'// text content from https://twitter.com/i/js_inst?c_name=ui_metrics

class MockDocument {
    globalBody = []
    constructor() {
        this.globalBody = []
        this.createElement('body')
    }
    createElement(tagName) {
        let children = []
        let newDom = {
            tagName,
            innerText: '',
            parentNode: '',
            get lastElementChild() {
                return this.children.length === 0 ? undefined : this.children[this.children.length - 1]
            },
            children,
            appendChild: (domHandle) => {
                domHandle.parentNode = newDom
            },
            removeChild: (_) => {}, //unnecessary
            setAttribute: (_, __) => {} //'display:none;' is unnecessary
        }
        this.globalBody.push(newDom)
        return newDom
    }
    getElementsByTagName(tagName) {
        return this.globalBody.filter((x) => x.tagName === tagName)
    }
}

globalThis.document = new MockDocument()

const astParse = parse(JsInstContent, { ecmaVersion: 'latest' })
const start = astParse.body[0].body.body[0].declarations[0].init.body.body[0].start
const end = astParse.body[0].body.body[0].declarations[0].init.body.body[0].end

const js_instrumentation = new Function(`const document=globalThis.document;return ${JsInstContent.slice(start, end)}()`)()
// {"rf":{"bafb594d182a7263a7737eb5aee962bdf7a3a6734c2c07500aa040a95b687792":-38,"cacc9185160911333be59d0b7a6733b492389a48875d2f788eebe6539faa9b3a":0,"ae9e77aeef536f719ab15915e88358737e35471a373e9f15d9503fa7ede76631":-149,"a1c1da4f154601d35f507612402d9cf12cc1fc97f91b11b4ecef39dc13177a8f":-13},"s":"DVqJ60up-YJhIv_WxKlGQCl67yYPBaQvtLdMiuqCW_B7dDr2j2ijdXz2kFYHCoe0Fo37AXj-g1U8B73sROvsSdaK8DcToPc_j2jr5C3Y_VfMG0n_nBf9ao3BsC-dPJOO5Nx0hZUuHhlIF6E8O1KXrXvYIUGx9W6Ctu3GffXFyv35nmMke9U7UeXD7V-gBOYAjSryScmnvrx33q3O6Ls8jQVT_a_qHjhVbLZdSUspTDE9oIETLIOGw9do_esqd99gG4D-sgz8VIcBLV-t6EDwHOQp9kqMXTKKLLCCrNupeLUyGNjb0yCgwFW9S5UsiZEgn_94VuONM4xIEe7SCePrpAAAAY14pzhE"}
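Note that the request earlier sends `response` as a string (`'{}'`), so a computed object would presumably need to be serialized before it is submitted. A minimal sketch, assuming the same field layout as the LoginJsInstrumentationSubtask request in this article; `buildJsInstrumentationInput` is a hypothetical helper name.

```javascript
// Build the subtask_inputs entry for LoginJsInstrumentationSubtask.
// `result` is the object produced by running the js_inst script;
// with no argument it falls back to the empty object the article says is accepted.
const buildJsInstrumentationInput = (result = {}) => ({
    js_instrumentation: {
        link: 'next_link',
        // The field is sent as a JSON string, matching the '{}' literal used earlier
        response: JSON.stringify(result)
    },
    subtask_id: 'LoginJsInstrumentationSubtask'
})
```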

LoginEnterUserIdentifierSSO

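In this step you can submit a screen_name, email address, or phone number. Because of the "Discoverability by phone number/email restriction bypass" issue, I recommend submitting the screen_name directly as account here; otherwise the next step may ask you to verify the screen_name or phone number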

const LoginEnterUserIdentifierSSO = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        settings_list: {
            link: 'next_link',
            setting_responses: [
                {
                    key: 'user_identifier',
                    response_data: {
                        text_data: {
                            result: account
                        }
                    }
                }
            ]
        },
        subtask_id: 'LoginEnterUserIdentifierSSO'
    }]
})
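Since an email address or phone number may trigger an extra verification step, a caller might want to know up front which kind of identifier it is about to submit. The sketch below is a rough heuristic only: `classifyIdentifier` and both patterns are my own assumptions, not rules used by Twitter, and an all-digit screen_name would be misread as a phone number.

```javascript
// Heuristic guess at the kind of login identifier (assumed rules, not Twitter's)
const classifyIdentifier = (account) => {
    // Anything with an '@' that is not a leading "@handle" is treated as an email
    if (account.includes('@') && !account.startsWith('@')) return 'email'
    // 7 to 15 digits with an optional leading '+' is treated as a phone number
    if (/^\+?\d{7,15}$/.test(account)) return 'phone'
    return 'screen_name'
}

console.log(classifyIdentifier('jack'))             // screen_name
console.log(classifyIdentifier('user@example.com')) // email
console.log(classifyIdentifier('+8613800000000'))   // phone
```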

*LoginEnterAlternateIdentifierSubtask

If you submitted an email address in the previous step anyway, you may run into LoginEnterAlternateIdentifierSubtask

const LoginEnterAlternateIdentifierSubtask = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        enter_text: {
            link: 'next_link',
            text: screen_name// or phone number
        },
        subtask_id: 'LoginEnterAlternateIdentifierSubtask'
    }]
})

LoginEnterPassword

This step submits the password

const LoginEnterPassword = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        enter_password: {
            link: 'next_link',
            password
        },
        subtask_id: 'LoginEnterPassword'
    }]
})

AccountDuplicationCheck

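This step checks whether two-factor verification is required. Depending on what is bound to the account, that may be an email verification code, an SMS code (I don't have a paid subscription, so I can't capture this step), a hardware key, TOTP, or a single-use backup code

Hardware keys are a hassle to work with, and single-use codes, as the name says, can only be used once, so I only looked into two common types: email verification codes and TOTP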

const AccountDuplicationCheck = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        check_logged_in_account: {
            link: 'AccountDuplicationCheck_false'
        },
        subtask_id: 'AccountDuplicationCheck'
    }]
})

If no two-factor verification is needed, this step returns the first login credential, auth_token

LoginAcid

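Accounts without two-factor verification enabled may sometimes be asked for an email verification code. For such accounts, submitting a wrong password in LoginEnterPassword always triggers LoginAcid

The acid value consists of digits and lowercase letters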

const LoginAcid = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        enter_text: {
            text: acid,
            link: 'next_link'
        },
        subtask_id: 'LoginAcid'
    }]
})

If the code is correct, this step returns the first login credential, auth_token
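Because the acid value is made up of digits and lowercase letters, an obviously malformed code can be rejected locally before a request is spent on it. A minimal sketch; `isPlausibleAcid` is a hypothetical name, and no length is checked because the article does not state one.

```javascript
// Local sanity check for an email verification code (acid):
// non-empty, digits and lowercase letters only
const isPlausibleAcid = (acid) => typeof acid === 'string' && /^[0-9a-z]+$/.test(acid)

console.log(isPlausibleAcid('a1b2c3d4')) // true
console.log(isPlausibleAcid('A1B2C3D4')) // false
```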

LoginTwoFactorAuthChooseMethod

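Some accounts use more than one two-factor method (I use both TOTP and a hardware key, for example), so the default option is not necessarily TOTP and has to be switched to TOTP here. (Since I can't use SMS codes, I can't confirm whether setting selected_choices to ['0'] always selects TOTP.)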

const LoginTwoFactorAuthChooseMethod = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        choice_selection: {
            link: 'next_link',
            selected_choices: ['0']
        },
        subtask_id: 'LoginTwoFactorAuthChooseMethod'
    }]
})

LoginTwoFactorAuthChallenge

Submit the six-digit code here. TOTP is an interesting and mature two-factor method; I'll write more about it when I get the chance

const LoginTwoFactorAuthChallenge = await sendLoginRequest(bearer_token, guest_token, cookie, headers, new URLSearchParams({}), {
    flow_token,
    subtask_inputs: [{
        enter_text: {
            link: 'next_link',
            text: totp // <- number string
        },
        subtask_id: 'LoginTwoFactorAuthChallenge'
    }]
})

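Once this completes, you get the first login credential, auth_token

This step also issues a short ct0 as a CSRF token, but the ct0 that serves as an account credential is only issued in the next step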
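The request bodies for the subtasks above differ only in one field, so they can be collected into a lookup table keyed by subtask_id. The following is a sketch of that idea rather than the code I actually run: `subtaskBuilders`, `buildRequestBody`, and the `credentials` object layout are hypothetical, while the payload fields are copied from the requests in this article.

```javascript
// Map each subtask_id to a function that builds its subtask_inputs payload.
// `c` is a hypothetical credentials object: { account, screen_name, password, acid, totp }
const subtaskBuilders = {
    LoginJsInstrumentationSubtask: () => ({ js_instrumentation: { link: 'next_link', response: '{}' } }),
    LoginEnterUserIdentifierSSO: (c) => ({
        settings_list: {
            link: 'next_link',
            setting_responses: [{ key: 'user_identifier', response_data: { text_data: { result: c.account } } }]
        }
    }),
    LoginEnterAlternateIdentifierSubtask: (c) => ({ enter_text: { link: 'next_link', text: c.screen_name } }),
    LoginEnterPassword: (c) => ({ enter_password: { link: 'next_link', password: c.password } }),
    AccountDuplicationCheck: () => ({ check_logged_in_account: { link: 'AccountDuplicationCheck_false' } }),
    LoginAcid: (c) => ({ enter_text: { link: 'next_link', text: c.acid } }),
    LoginTwoFactorAuthChooseMethod: () => ({ choice_selection: { link: 'next_link', selected_choices: ['0'] } }),
    LoginTwoFactorAuthChallenge: (c) => ({ enter_text: { link: 'next_link', text: c.totp } })
}

// Build the JSON body for one step; throws on a subtask this table does not cover
const buildRequestBody = (flow_token, subtask_id, credentials = {}) => {
    if (!Object.hasOwn(subtaskBuilders, subtask_id)) {
        throw new Error(`Unhandled subtask: ${subtask_id}`)
    }
    return { flow_token, subtask_inputs: [{ ...subtaskBuilders[subtask_id](credentials), subtask_id }] }
}
```

The result of `buildRequestBody` can be passed as the `body` argument of `sendLoginRequest`; throwing on an unknown subtask_id makes it obvious when the server asks for something this table does not handle, such as a hardware key.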

Viewer

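This returns the account's basic information, and also the ct0

* The request only needs the auth_token and ct0 cookies. If you don't need the user info and only want the long ct0, you can even leave out the short ct0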

const getViewer = async (bearer_token, cookie, viewerQueryID, viewerFeatures) => fetch(`https://api.twitter.com/graphql/${viewerQueryID}/Viewer?` + new URLSearchParams({
    variables: JSON.stringify({ withCommunitiesMemberships: true, withSubscribedTab: true, withCommunitiesCreation: true }),
    features: JSON.stringify(viewerFeatures)
}).toString(), {
    headers: {
        authorization: bearer_token,
        'x-csrf-token': cookie.ct0,
        cookie: Object.entries(cookie).map(([key, value]) => `${key}=${value}`).join('; ')
    }
}).then(async response => ({
    message: '',
    cookies: Object.fromEntries([...response.headers.entries()].filter((header) => header[0] === 'set-cookie').map((header) => {
        const tmpCookie = header[1].split(';')[0]
        const firstEqual = tmpCookie.indexOf('=')
        return [tmpCookie.slice(0, firstEqual), tmpCookie.slice(firstEqual + 1)]
    })),
    content: await response.json()
})).then(res => {
    //console.log(res)
    return res
}).catch(error => {
    //console.error(error)
    return {
        message: error.message,
        cookies: {},
        content: {}
    }
})

const viewer = await getViewer(bearer_token, cookie, 'qevmDaYaF66EOtboiNoQbQ', { "responsive_web_graphql_exclude_directive_enabled": true, "verified_phone_label_enabled": false, "creator_subscriptions_tweet_preview_api_enabled": true, "responsive_web_graphql_skip_user_profile_image_extensions_enabled": false, "responsive_web_graphql_timeline_navigation_enabled": true })

cookie = { ...cookie, ...viewer.cookies }
//...

What next?

Add or replace auth_token and ct0 in your browser to log straight in to the corresponding Twitter account

Or use these two cookies for anything else that needs them
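For requests made outside the browser, the two cookies are combined with the bearer token in the same way as in the Viewer request: ct0 goes into both the cookie header and x-csrf-token. A minimal sketch, where `buildAuthenticatedHeaders` is a hypothetical helper name and whether a given endpoint needs further headers is not covered here.

```javascript
// Build request headers from the two credential cookies,
// mirroring the headers used by getViewer in this article
const buildAuthenticatedHeaders = (bearerToken, { auth_token, ct0 } = {}) => {
    if (!auth_token || !ct0) {
        throw new Error('both auth_token and ct0 are required')
    }
    return {
        authorization: bearerToken,
        // The CSRF header repeats the value of the ct0 cookie
        'x-csrf-token': ct0,
        cookie: `auth_token=${auth_token}; ct0=${ct0}`
    }
}
```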

  • I collected the responses you may encounter in most of these flows, with personal information redacted; see here

Error handling

If a step fails, a JSON body like the following is returned

The current flow_token can still be used at this point

{
    "errors": [
        {
            "code": 399,
            "message": "Incorrect. Please try again."
        }
    ]
}
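Since the flow_token stays usable after such an error, a failed step can simply be attempted again with corrected input. The sketch below shows one way to detect the error body and retry. `getFlowErrors` and `retryStep` are hypothetical names, the `-1` code for network failures is made up, and the result shape is the `{ message, cookies, content }` object returned by `sendLoginRequest`.

```javascript
// Collect error entries from a sendLoginRequest-style result
const getFlowErrors = (result) => {
    // A non-empty message means the request itself failed (made-up code -1)
    if (result.message) return [{ code: -1, message: result.message }]
    return Array.isArray(result.content?.errors) ? result.content.errors : []
}

// Call `attempt(i)` until it returns a result without errors.
// `attempt` is expected to reuse the same flow_token and may supply new input each time.
const retryStep = async (attempt, maxAttempts = 3) => {
    let lastErrors = []
    for (let i = 0; i < maxAttempts; i++) {
        const result = await attempt(i)
        lastErrors = getFlowErrors(result)
        if (lastErrors.length === 0) return result
    }
    throw new Error(lastErrors.map((e) => `${e.code}: ${e.message}`).join('; '))
}
```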

OAuth

...hold on, this part will be updated later

For an implementation, see https://github.com/zedeus/nitter/issues/983#issuecomment-169002582 for now

authenticate_web_view

With an OAuth token and secret, this endpoint returns an auth_token, and #Viewer can then be used to get the ct0

I won't go over how to compute the OAuth signature here; see How to crawl Twitter (Android)

const cookies = await fetch("https://twitter.com/account/authenticate_web_view", {
    headers: {
        "content-type": "application/json",
        Authorization: oauth_signature_builder(...), // -> /posts/how-to-crawl-twitter-with-android
    },
  },
).then((response) =>
  Object.fromEntries([...response.headers.entries()].filter((header) =>
        header[0] === "set-cookie"
    ).map((header) => {
        const tmpCookie = header[1].split(";")[0]
        const firstEqual = tmpCookie.indexOf("=")
        return [tmpCookie.slice(0, firstEqual), tmpCookie.slice(firstEqual + 1)]
    })
  )
)

console.log(cookies)

// {
//     ...
//     auth_token: "..."
// }

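Miscellaneous

  • Still want to build a crawler? Brute force it. With several thousand to tens of thousands of accounts and a proxy pool, you can probably ignore rate limits and bans...
  • The full code for this article is here
  • The Twitter Monitor Api that is still running actually supports cookie login (that said, I don't recommend logging in on any public instance, because the account, password, and returned cookies may all be logged by the instance; deploy your own). See the relevant twitter monitor code for details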

https://blog.nest.moe/posts/how-to-login-to-twitter

When reposting or quoting this article, please follow the Creative Commons Attribution license

