
How to Crawl Twitter (Android)

2023-08-07

#Twitter
#Twitter Monitor
#Twitter Graphql
#Twitter Api

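Another two years have gone by, and a lot has happened in that time. The approaches from the previous two posts have both been broken to varying degrees. The bearer token + guest token pattern still works for now, but it is time to find a fallback.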

Guest Account

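Unlike the web client, which uses guest_token, the mobile client uses a temporary account with limited privileges when no account is logged in. Such an account can call most client endpoints as well as web endpoints, and some endpoints get a fairly high rate limit; see Rate limit status for details. According to the Twitter Developer Platform, "Access tokens are not explicitly expired", so in theory the tokens of these accounts stay valid forever once issued. In practice, my testing shows they last about a month.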

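These accounts are not invincible, though. Sending many requests in a short period costs the account access to all timeline-related endpoints (including but not limited to timelines, timelines with replies, single tweets, search, lists and communities) for a day. The number of accounts you can obtain per day is also unpredictable: typically about 20 per IP per day, while an IP that has not visited Twitter recently can get a hundred or even several hundred. Mass registration gets the IP blacklisted, however, and for the next few days you may not get a single account.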

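When I tested, a loop with no pauses stopped working after six or seven requests, and calling the search endpoint kills an account even faster. Once several accounts have died on the same IP, a temporary account can basically be used only once a day for later requests; the second request gets it blocked. @banka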
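
The limits above suggest keeping some bookkeeping per account. Below is a minimal sketch of an in-memory pool that rests an account for a day after it is rate limited and stops handing it out after roughly a month. The class name `GuestAccountPool`, its methods, and the exact 30-day lifetime are assumptions made for illustration; they are not part of Twitter Monitor.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const LIFETIME_MS = 30 * DAY_MS; // assumption: "about a month"

// Hypothetical pool; accounts are the objects returned by the onboarding flow.
class GuestAccountPool {
  constructor() {
    this.entries = [];
  }

  add(account, createdAt = Date.now()) {
    this.entries.push({ account, createdAt, restUntil: 0, lastUsed: 0 });
  }

  // Call this when a request with the account came back rate limited.
  markLimited(oauthToken, now = Date.now()) {
    const entry = this.entries.find((e) => e.account.oauth_token === oauthToken);
    if (entry) entry.restUntil = now + DAY_MS;
  }

  // Return the least recently used account that is neither resting nor too old.
  pick(now = Date.now()) {
    const usable = this.entries
      .filter((e) => now >= e.restUntil && now - e.createdAt < LIFETIME_MS)
      .sort((a, b) => a.lastUsed - b.lastUsed);
    if (usable.length === 0) return null;
    usable[0].lastUsed = now;
    return usable[0].account;
  }
}
```

Rotating through the least recently used account spreads requests out, which fits the observation that bursts are what get an account blocked.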

Client

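To avoid surprises, I picked 9.95.0-release.0, the last version before things went wrong in July. The device is a OnePlus 3T flashed with the "latest" OxygenOS (Android 9); for rooting and installing the certificate, see Capturing packets on an Android 9 device. In fact, according to the Nitter issue, versions up to 10.1 still support this feature.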

Consumer key & secret

I had already obtained the consumer key and consumer secret through other channels, so this section merely works backwards from the known result.

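I wrote this section after finishing the rest of the post, and the consumer key and consumer secret are fixed values, so I simply grabbed an apk from the internet and unpacked it with JADX. The variable names in your decompiled output may differ from mine.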

⚠ I am not sure whether the relevant parties will ask me to remove this section in the future.

  • Search directly for the keyword /oauth2/token. The search may return nothing, but normally there is at least one result
    Node
    defpackage.qvb.b() yec<yg, TwitterErrors>aVar.m("/oauth2/token", "/")
    ......
  • A quick look at the logic shows that dk1.c() computes the base64 encoding, so g and g2 are the consumer key and the consumer secret respectively
      public final class qvb extends d1m<yg, TwitterErrors> {
          public yg Z2;
          public final mhi a3;
    
          public qvb() {
              String str = gst.d;
              this.a3 = new mhi(r3d.b().A1());
              f();
          }
    
          @Override // defpackage.d1m, defpackage.pec, defpackage.zv0, defpackage.cw0, defpackage.xec
          public final yec<yg, TwitterErrors> b() {
              vvf.c cVar = new vvf.c(yg.class);
              tdc.a aVar = new tdc.a();
              aVar.m("/oauth2/token", "/");
              int i = uji.a;
              cec Z = Z(aVar.j().a(mqt.a()));
              Z.h = bec.b.x;
              Z.j = cVar;
              Z.b(h6f.q(new xt1("grant_type", "client_credentials")));
              bec d = Z.d();
              rhi rhiVar = this.a3.a;
              String g = rs1.g(rhiVar.a);
              String g2 = rs1.g(rhiVar.b);
              d.B("Authorization", "Basic ".concat(dk1.c((g + ":" + g2).getBytes())));
              d.d();
              if (d.w()) {
                  this.Z2 = (yg) cVar.c;
              }
              return yec.a(d, cVar);
          }
    
          @Override // defpackage.pec, defpackage.xec
          public final String m() {
              return mqt.a().b;
          }
      }
    
  • Tracing further up leads to the class below; combine the pieces and the values fall out
      package defpackage;
    
      /* compiled from: Twttr */
      /* renamed from: gst  reason: default package */
      /* loaded from: classes5.dex */
      public final class gst extends ust {
          public static final String d;
          public static final String e;
      
          static {
              byte[] bArr = {-29, -88, -64, -95, -61, -89, -44, -68, -88, -98, -32, -63, -30, -96, -100, -63, -98, -80, -31, -97};
              byte[] bArr2 = {-44, -77, -93, -31, -35, -47, -48, -76, -76, -93, -78, -48, -32, -61, -86, -35, -56, -81, -33, -27, -93, -87, -81, -61, -94, -65, -47, -49, -97, -66, -66, -53, -61, -84, -67, -96, -58, -64, -94, -33, -91, -99, -93};
              StringBuilder sb = new StringBuilder(20);
              for (int i = 0; i < 20; i++) {
                  sb.append((char) (22 - bArr[i]));
              }
              d = sb.toString();
              StringBuilder sb2 = new StringBuilder(43);
              for (int i2 = 0; i2 < 43; i2++) {
                  sb2.append((char) (22 - bArr2[i2]));
              }
              e = sb2.toString();
          }
      
          public gst() {
              super(d, e);
          }
      }
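
The decoding loop in the `gst` class can be replayed outside the app. The sketch below ports the Java `(char) (22 - b)` step to JavaScript using the two byte arrays copied from the decompiled class; the names `KEY_BYTES`, `SECRET_BYTES` and `decode` are mine, not from the app.

```javascript
// Byte arrays copied verbatim from the decompiled static initializer.
const KEY_BYTES = [-29, -88, -64, -95, -61, -89, -44, -68, -88, -98, -32, -63, -30, -96, -100, -63, -98, -80, -31, -97];
const SECRET_BYTES = [-44, -77, -93, -31, -35, -47, -48, -76, -76, -93, -78, -48, -32, -61, -86, -35, -56, -81, -33, -27, -93, -87, -81, -61, -94, -65, -47, -49, -97, -66, -66, -53, -61, -84, -67, -96, -58, -64, -94, -33, -91, -99, -93];

// Mirrors the Java loop: every character is (char) (22 - b).
const decode = (bytes) => String.fromCharCode(...bytes.map((b) => 22 - b));

const decodedKey = decode(KEY_BYTES);
const decodedSecret = decode(SECRET_BYTES);
console.log(decodedKey, decodedSecret);
```

The two decoded strings are the same values as TW_CONSUMER_KEY and TW_CONSUMER_SECRET in the Bearer token section.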
    

Bearer token

As with the web version, this bearer token should not change in theory. For more about this token, see here.

// Runtime requirements (the same for all snippets below)
// Node v18.15.0 / Deno / Bun...
const TW_CONSUMER_KEY = '3nVuSoBZnx6U4vzUxf5w'
const TW_CONSUMER_SECRET = 'Bcs59EFbbsdF6Sl9Ng71smgStWEGwXXKSjYvPVt7qys'

const TW_ANDROID_BASIC_TOKEN = `Basic ${btoa(TW_CONSUMER_KEY+':'+TW_CONSUMER_SECRET)}`

const getBearerToken = async () => {
    const tmpTokenResponse = await (await fetch('https://api.twitter.com/oauth2/token', {
        headers: {
            Authorization: TW_ANDROID_BASIC_TOKEN,
            'Content-Type': 'application/x-www-form-urlencoded'
        },
        method: 'post',
        body: 'grant_type=client_credentials'
    })).json()
    // Build the header value explicitly instead of relying on the key order of the response
    return `Bearer ${tmpTokenResponse.access_token}`
}

const bearer_token = await getBearerToken()
// Bearer AAAAAAAAAAAAAAAAAAAAAFXzAwAAAAAAMHCxpeSDG1gLNLghVe8d74hl6k4%3DRUMF4xAQLsbeBhTSRrCiQpJtxoGWeyHrDb5te2jpGskWDFW82F

oauth_token & oauth_token_secret

The bearer_token used here is the one obtained in the previous section.

Get the guest token

const guest_token = (await (await fetch("https://api.twitter.com/1.1/guest/activate.json", {
    headers: {
        Authorization: bearer_token
    },
    method: "post"
})).json()).guest_token

Get the first flow_token

const flow_token = (await (await fetch('https://api.twitter.com/1.1/onboarding/task.json?flow_name=welcome&api_version=1&known_device_token=&sim_country_code=us', {
    headers: {
        Authorization: bearer_token,
        'Content-Type': 'application/json',
        'User-Agent': 'TwitterAndroid/9.95.0-release.0 (29950000-r-0) ONEPLUS+A3010/9 (OnePlus;ONEPLUS+A3010;OnePlus;OnePlus3;0;;1;2016)',
        'X-Twitter-API-Version': 5,
        'X-Twitter-Client': 'TwitterAndroid',
        'X-Twitter-Client-Version': '9.95.0-release.0',
        'OS-Version': '28',
        'System-User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 9; ONEPLUS A3010 Build/PKQ1.181203.001)',
        'X-Twitter-Active-User': 'yes',
        'X-Guest-Token': guest_token
    },
    method: 'post',
    body: '{"flow_token":null,"input_flow_data":{"country_code":null,"flow_context":{"start_location":{"location":"splash_screen"}},"requested_variant":null,"target_user_id":0},"subtask_versions":{"generic_urt":3,"standard":1,"open_home_timeline":1,"app_locale_update":1,"enter_date":1,"email_verification":3,"enter_password":5,"enter_text":5,"one_tap":2,"cta":7,"single_sign_on":1,"fetch_persisted_data":1,"enter_username":3,"web_modal":2,"fetch_temporary_password":1,"menu_dialog":1,"sign_up_review":5,"interest_picker":4,"user_recommendations_urt":3,"in_app_notification":1,"sign_up":2,"typeahead_search":1,"user_recommendations_list":4,"cta_inline":1,"contacts_live_sync_permission_prompt":3,"choice_selection":5,"js_instrumentation":1,"alert_dialog_suppress_client_events":1,"privacy_options":1,"topics_selector":1,"wait_spinner":3,"tweet_selection_urt":1,"end_flow":1,"settings_list":7,"open_external_link":1,"phone_verification":5,"security_key":3,"select_banner":2,"upload_media":1,"web":2,"alert_dialog":1,"open_account":2,"action_list":2,"enter_phone":2,"open_link":1,"show_code":1,"update_users":1,"check_logged_in_account":1,"enter_email":2,"select_avatar":4,"location_permission_prompt":2,"notifications_permission_prompt":4}}'
})).json()).flow_token
//g;4174100000134946:-1691400000588:S0Jot3jIr00000M23lT3jJCk:0

Get an account, or fail

const subtasks = (await (await fetch('https://api.twitter.com/1.1/onboarding/task.json', {
    headers: {
        Authorization: bearer_token,// Bearer ...
        'Content-Type': 'application/json',
        'User-Agent': 'TwitterAndroid/9.95.0-release.0 (29950000-r-0) ONEPLUS+A3010/9 (OnePlus;ONEPLUS+A3010;OnePlus;OnePlus3;0;;1;2016)',
        'X-Twitter-API-Version': 5,
        'X-Twitter-Client': 'TwitterAndroid',
        'X-Twitter-Client-Version': '9.95.0-release.0',
        'OS-Version': '28',
        'System-User-Agent': 'Dalvik/2.1.0 (Linux; U; Android 9; ONEPLUS A3010 Build/PKQ1.181203.001)',
        'X-Twitter-Active-User': 'yes',
        'X-Guest-Token': guest_token
    },
    method: 'post',
    // The `flow_token` on the next line is the one obtained in the previous step
    body: '{"flow_token":"' + flow_token + '","subtask_inputs":[{"open_link":{"link":"next_link"},"subtask_id":"NextTaskOpenLink"}],"subtask_versions":{"generic_urt":3,"standard":1,"open_home_timeline":1,"app_locale_update":1,"enter_date":1,"email_verification":3,"enter_password":5,"enter_text":5,"one_tap":2,"cta":7,"single_sign_on":1,"fetch_persisted_data":1,"enter_username":3,"web_modal":2,"fetch_temporary_password":1,"menu_dialog":1,"sign_up_review":5,"interest_picker":4,"user_recommendations_urt":3,"in_app_notification":1,"sign_up":2,"typeahead_search":1,"user_recommendations_list":4,"cta_inline":1,"contacts_live_sync_permission_prompt":3,"choice_selection":5,"js_instrumentation":1,"alert_dialog_suppress_client_events":1,"privacy_options":1,"topics_selector":1,"wait_spinner":3,"tweet_selection_urt":1,"end_flow":1,"settings_list":7,"open_external_link":1,"phone_verification":5,"security_key":3,"select_banner":2,"upload_media":1,"web":2,"alert_dialog":1,"open_account":2,"action_list":2,"enter_phone":2,"open_link":1,"show_code":1,"update_users":1,"check_logged_in_account":1,"enter_email":2,"select_avatar":4,"location_permission_prompt":2,"notifications_permission_prompt":4}}'
})).json()).subtasks

const account = subtasks.find(task => task.subtask_id === 'OpenAccount')?.open_account
console.log(account)
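
The request body above is assembled by string concatenation, which would break if a flow_token ever contained a quote. A minimal sketch of building the same body with JSON.stringify follows. `buildNextLinkBody` is a hypothetical helper, and `SUBTASK_VERSIONS` is abbreviated here on purpose: I have not checked whether the server accepts a shortened map, so a real request should carry the full subtask_versions object shown in the request above.

```javascript
// Abbreviated for readability; use the full map from the captured request in practice.
const SUBTASK_VERSIONS = { open_link: 1, open_account: 2 };

// Hypothetical helper: serializes the "next_link" step of the onboarding flow.
const buildNextLinkBody = (flowToken, subtaskVersions = SUBTASK_VERSIONS) =>
  JSON.stringify({
    flow_token: flowToken,
    subtask_inputs: [{ open_link: { link: 'next_link' }, subtask_id: 'NextTaskOpenLink' }],
    subtask_versions: subtaskVersions
  });

console.log(buildNextLinkBody('example-flow-token'));
```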

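If `account` is undefined here, one of two things has probably happened:

  • The IP has unfortunately been restricted. Switch to another IP or come back in a few days. Farming guest accounts by abusing big cloud providers is hard; at the very least it cannot be done from CloudFlare workers
  • The account creation flow got stuck, and the same request has to be sent again after a while (anywhere from a few seconds to a few minutes)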
    • Also for account creation I can't workout whether there is some fundamental delay in generation of the oauth guest accounts requiring a second call to the open_link to get an account open or whether it's the rotation of IP that does it but you can go through patches of accounts opening right away and sometimes requiring 3+ calls for it to happen. @ImTheDeveloper

    • Sometimes it appears there is a delay in the creation of the account so you need to wait a few seconds/minutes and then it will create if you call the next_link again. @ImTheDeveloper
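
The second case can be handled with a small retry loop. The sketch below is only an illustration: `requestSubtasks` is a hypothetical callback that repeats the onboarding request shown earlier and resolves to the `subtasks` array, and the attempt count and delays are guesses based on the "few seconds to a few minutes" range quoted above.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Repeats the same request until an OpenAccount subtask shows up or attempts run out.
const openAccountWithRetry = async (requestSubtasks, { attempts = 4, delayMs = 5000 } = {}) => {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const subtasks = (await requestSubtasks()) ?? [];
    const account = subtasks.find((task) => task.subtask_id === 'OpenAccount')?.open_account;
    if (account) return account;
    // Wait a little longer after each failed attempt (linear back-off).
    if (attempt < attempts) await sleep(delayMs * attempt);
  }
  return null; // still nothing: the IP is probably restricted
};
```

If the function returns null even after several attempts, the first case (a restricted IP) is the more likely explanation.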

If you get an object here, the account object looks roughly like this:

{
    "user": {
        "id": 168862000062124800,
        "id_str": "168862000062124800",
        "name": "Open App User",
        "screen_name": "_LO_08072W00Z6G",
        "user_type": "Soft"
    },
    "next_link": {
        "link_type": "subtask",
        "link_id": "next_link",
        "subtask_id": "OpenAppFlowStartAccountSetupOpenLink"
    },
    "oauth_token": "168862000062124800-yOxTZxJc4nKGGJ0lik000069JgJJX",
    "oauth_token_secret": "PSrSIwXo0000RvWvcwQ0000dLgay0000NbpvSztF6n",
    "attribution_event": "signup"
}

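The screen_name is structured as _LO_ + the current date + 7 random characters (upper- or lowercase letters, or digits).

I recommend recording oauth_token, oauth_token_secret and screen_name.

Whether you record the uid does not matter; these accounts cannot be looked up by uid or screen_name.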
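
As a small illustration of the naming scheme and of what to keep, here is a sketch that parses a guest screen_name and reduces an account object to the three recommended fields. The MMDD reading of the date part is an assumption inferred from the single sample above (0807), and both function names are hypothetical.

```javascript
// Assumption: the date part is MMDD, followed by 7 alphanumeric characters.
const GUEST_SCREEN_NAME = /^_LO_(\d{2})(\d{2})([A-Za-z0-9]{7})$/;

const parseGuestScreenName = (screenName) => {
  const match = GUEST_SCREEN_NAME.exec(screenName);
  return match ? { month: match[1], day: match[2], suffix: match[3] } : null;
};

// Keep only the fields worth storing.
const toStoredAccount = (account) => ({
  oauth_token: account.oauth_token,
  oauth_token_secret: account.oauth_token_secret,
  screen_name: account.user.screen_name
});

console.log(parseGuestScreenName('_LO_08072W00Z6G'));
```

The creation date recovered this way is handy for retiring accounts once they approach the one-month mark.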

Creating the OAuth signature

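Just follow Creating a signature.

Roughly, the idea is to sort the payload, merge it into a single string, and compute an HMAC-SHA1 over it. The easiest route is to find a package that computes OAuth signatures.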

// browser
const buffer_to_base64 = buf => {
    let binary = '';
    const bytes = new Uint8Array(buf);
    for (var i = 0; i < bytes.byteLength; i++) {
        binary += String.fromCharCode(bytes[i]);
    }
    return btoa(binary)
}

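// Node.js (an alternative to the browser version above; define only one of the two)
// crypto.subtle.sign() resolves to an ArrayBuffer, so wrap it in a Buffer before encoding
const buffer_to_base64 = buf => Buffer.from(buf).toString('base64')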

// The oauth_nonce parameter is a unique token your application should generate for each unique request. Twitter will use this value to determine whether a request has been submitted multiple times. The value for this request was generated by base64 encoding 32 bytes of random data, and stripping out all non-word characters, but any approach which produces a relatively random alphanumeric string **should be OK** here.
// So just fill in any value you want.
const getOauthAuthorization = async (oauth_token, oauth_token_secret, method = 'GET', url = '', body = '', timestamp = Math.floor(Date.now() / 1000), oauth_nonce = btoa(new Array(2).fill(Math.random().toString()).join('').slice(4)).replaceAll('+', '').replaceAll('/', '').replaceAll('=', '')) => {
    if (!url) {
        return ''
    }
    method = method.toUpperCase()
    const parseUrl = new URL(url)
    const link = parseUrl.origin + parseUrl.pathname
    const payload = [...parseUrl.searchParams.entries()]
    if (body) {
        let isJson = false
        try {
            JSON.parse(body)
            isJson = true
        } catch (e) {}
        if (!isJson) {
            payload.push(...new URLSearchParams(body).entries())
        }
    }
    payload.push(['oauth_version', '1.0'])
    payload.push(['oauth_signature_method', 'HMAC-SHA1'])
    payload.push(['oauth_consumer_key', TW_CONSUMER_KEY])
    payload.push(['oauth_token', oauth_token])
    payload.push(['oauth_nonce', oauth_nonce])
    payload.push(['oauth_timestamp', String(timestamp)])

    const forSign =
        method + '&' + encodeURIComponent(link) + '&' + new URLSearchParams(payload.sort((a, b) => (a[0] > b[0] ? 1 : a[0] < b[0] ? -1 : 0))).toString().replaceAll('+', '%20').replaceAll('%', '%25').replaceAll('=', '%3D').replaceAll('&', '%26')
    let key = await crypto.subtle.importKey("raw", new TextEncoder('utf-8').encode(TW_CONSUMER_SECRET + '&' + (oauth_token_secret ? oauth_token_secret : '')), { name: "HMAC", hash: "SHA-1" }, false, ["sign", "verify"])
    let sign = await crypto.subtle.sign('HMAC', key, new TextEncoder('utf-8').encode(forSign))
    return {
        method,
        url,
        parse_url: parseUrl,
        timestamp,
        oauth_nonce,
        oauth_token,
        oauth_token_secret,
        oauth_consumer_key: TW_CONSUMER_KEY,
        oauth_consumer_secret: TW_CONSUMER_SECRET,
        payload,
        sign: buffer_to_base64(sign)
    }
}
const OAuthSign = await getOauthAuthorization(account.oauth_token, account.oauth_token_secret, 'GET', "https://api.twitter.com/graphql/G8jKRx5LiyrRDs5FcsUjsw/SearchTimeline?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22query_source%22%3A%22typed_query%22%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22query%22%3A%22aaaa%22%2C%22timeline_type%22%3A%22Top%22%7D&features=%7B%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22unified_cards_ad_metadata_container_dynamic_card_content_query_enabled%22%3Atrue%2C%22super_follow_tweet_api_enabled%22%3Atrue%2C%22android_graphql_skip_api_media_color_palette%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D") // getOauthAuthorization is async, so await it; for POST you also need to pass the body
const authorization = `OAuth realm="http://api.twitter.com/", oauth_version="1.0", oauth_token="${OAuthSign.oauth_token}", oauth_nonce="${OAuthSign.oauth_nonce}", oauth_timestamp="${OAuthSign.timestamp}", oauth_signature="${encodeURIComponent(OAuthSign.sign)}", oauth_consumer_key="${OAuthSign.oauth_consumer_key}", oauth_signature_method="HMAC-SHA1"`

Here, realm is always http://api.twitter.com/. When the body is not application/x-www-form-urlencoded, the body does not take part in the signature computation.
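
To make the signing input concrete, here is a sketch of the signature base string step alone, without the HMAC. It uses strict RFC 3986 percent-encoding as the OAuth 1.0 spec describes, which differs slightly from the URLSearchParams serialization for characters such as `*` and `~`; for plain alphanumeric values both give the same result. `percentEncode`, `bodyParams` and `buildBaseString` are names I made up for this example.

```javascript
// encodeURIComponent leaves ! ' ( ) * unescaped; RFC 3986 wants them encoded.
const percentEncode = (value) =>
  encodeURIComponent(value).replace(/[!'()*]/g, (c) => '%' + c.charCodeAt(0).toString(16).toUpperCase());

// Only form-encoded bodies contribute parameters; JSON bodies are skipped.
const bodyParams = (body, contentType = '') =>
  body && contentType.startsWith('application/x-www-form-urlencoded')
    ? [...new URLSearchParams(body).entries()]
    : [];

// extraParams carries the oauth_* pairs plus whatever bodyParams() returned.
const buildBaseString = (method, url, extraParams = []) => {
  const parsed = new URL(url);
  const pairs = [...parsed.searchParams.entries(), ...extraParams]
    .map(([k, v]) => [percentEncode(k), percentEncode(v)])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : a[1] < b[1] ? -1 : a[1] > b[1] ? 1 : 0));
  const paramString = pairs.map(([k, v]) => `${k}=${v}`).join('&');
  return [method.toUpperCase(), percentEncode(parsed.origin + parsed.pathname), percentEncode(paramString)].join('&');
};

console.log(buildBaseString('get', 'https://api.twitter.com/1.1/a.json?b=2&a=1', [['c', 'x y']]));
```

The resulting string is what gets signed with HMAC-SHA1, using the consumer secret and the oauth_token_secret joined by `&` as the key.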

In subsequent requests, this authorization value replaces bearer_token.

If this is still unclear, try it hands-on on the online signing page.

*queryId & featureSwitches

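*Note that this part only extracts the queryId and featureSwitches used by Twitter Monitor. Not everyone needs it, unless you want fine-grained control over certain features. In most cases you can simply paste these values in as strings.

Since I really could not stand reading the Java source, I just wrote a script that extracts everything from the captured requests in one go and converts it into roughly the same format as the web version.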

// Only the ones I need are listed here; capture and add others as needed
const list = [
    'https://na.albtls.t.co/graphql/oPppcargziU1uDQHAUmH-A/UserResultByIdQuery?variables=%7B%22include_smart_block%22%3Atrue%2C%22includeTweetImpression%22%3Atrue%2C%22includeTranslatableProfile%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22include_tipjar%22%3Atrue%2C%22include_highlights_info%22%3Atrue%2C%22includeEditPerspective%22%3Afalse%2C%22include_reply_device_follow%22%3Atrue%2C%22includeEditControl%22%3Atrue%2C%22include_verified_phone_status%22%3Afalse%2C%22rest_id%22%3A%22780211%22%7D&features=%7B%22verified_phone_label_enabled%22%3Afalse%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D',
    'https://na.albtls.t.co/graphql/3JNH4e9dq1BifLxAa3UMWg/UserWithProfileTweetsQueryV2?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22count%22%3A20%2C%22rest_id%22%3A%222373%22%2C%22includeTweetVisibilityNudge%22%3Atrue%2C%22autoplay_enabled%22%3Atrue%7D&features=%7B%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22unified_cards_ad_metadata_container_dynamic_card_content_query_enabled%22%3Atrue%2C%22super_follow_tweet_api_enabled%22%3Atrue%2C%22android_graphql_skip_api_media_color_palette%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D',
    'https://api-0-5-0.twitter.com/graphql/8IS8MaO-2EN6GZZZb8jF0g/UserWithProfileTweetsAndRepliesQueryV2?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22count%22%3A20%2C%22rest_id%22%3A%221449200000377%22%2C%22includeTweetVisibilityNudge%22%3Atrue%2C%22autoplay_enabled%22%3Atrue%7D&features=%7B%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22unified_cards_ad_metadata_container_dynamic_card_content_query_enabled%22%3Atrue%2C%22super_follow_tweet_api_enabled%22%3Atrue%2C%22android_graphql_skip_api_media_color_palette%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D',
    'https://api.twitter.com/graphql/G8jKRx5LiyrRDs5FcsUjsw/SearchTimeline?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22query_source%22%3A%22typed_query%22%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22query%22%3A%22aaaa%22%2C%22timeline_type%22%3A%22Top%22%7D&features=%7B%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22unified_cards_ad_metadata_container_dynamic_card_content_query_enabled%22%3Atrue%2C%22super_follow_tweet_api_enabled%22%3Atrue%2C%22android_graphql_skip_api_media_color_palette%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D',
    'https://na.albtls.t.co/graphql/83h5UyHZ9wEKBVzALX8R_g/ConversationTimelineV2?variables=%7B%22referrer%22%3A%22profile%22%2C%22includeTweetImpression%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22isReaderMode%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22focalTweetId%22%3A167600000992%2C%22includeCommunityTweetRelationship%22%3Atrue%2C%22includeTweetVisibilityNudge%22%3Atrue%7D&features=%7B%22longform_notetweets_inline_media_enabled%22%3Atrue%2C%22super_follow_badge_privacy_enabled%22%3Atrue%2C%22longform_notetweets_rich_text_read_enabled%22%3Atrue%2C%22super_follow_user_api_enabled%22%3Atrue%2C%22unified_cards_ad_metadata_container_dynamic_card_content_query_enabled%22%3Atrue%2C%22super_follow_tweet_api_enabled%22%3Atrue%2C%22android_graphql_skip_api_media_color_palette%22%3Atrue%2C%22creator_subscriptions_tweet_preview_api_enabled%22%3Atrue%2C%22freedom_of_speech_not_reach_fetch_enabled%22%3Atrue%2C%22creator_subscriptions_subscription_count_enabled%22%3Atrue%2C%22tweetypie_unmention_optimization_enabled%22%3Atrue%2C%22longform_notetweets_consumption_enabled%22%3Atrue%2C%22subscriptions_verification_info_enabled%22%3Atrue%2C%22blue_business_profile_image_shape_enabled%22%3Atrue%2C%22tweet_with_visibility_results_prefer_gql_limited_actions_policy_enabled%22%3Atrue%2C%22super_follow_exclusive_tweet_notifications_enabled%22%3Atrue%7D',
    'https://api.twitter.com/graphql/w9iN3QyYsynBlEXr9h6M2Q/TranslateProfileQuery?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22includeEditControl%22%3Atrue%2C%22rest_id%22%3A111%7D',
    'https://api.twitter.com/graphql/hE1HCUzioO9QSLpvIBvvYA/TranslateTweetQuery?variables=%7B%22includeTweetImpression%22%3Atrue%2C%22includeHasBirdwatchNotes%22%3Afalse%2C%22includeEditPerspective%22%3Afalse%2C%22tweet_id%22%3A111%2C%22includeEditControl%22%3Atrue%7D'
]

const queryString = list
    .map((x) => {
        const tmpParse = new URL(x)
        const tmpPath = tmpParse.pathname.split('/')
        const operationName = tmpPath.pop()
        const queryId = tmpPath.pop()
        //operationType: "query"
        const features = JSON.parse(tmpParse.searchParams.get('features') || '{}')
        const variables = JSON.parse(tmpParse.searchParams.get('variables'))
        const data = {
            queryId: queryId,
            operationName: operationName,
            operationType: 'query',
            metadata: { featureSwitches: Object.keys(features) },
            features: features
        }
        return `export const _${operationName} = ${JSON.stringify(data)}`
        //"metadata":{"featureSwitches"
    })
    .join('\n')
//writeFileSync('./androidQueryIdList.js', queryString)
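
Going the other way, an entry produced by the script can be turned back into a request URL. The sketch below assumes the entry shape emitted above; `buildGraphqlUrl` is a hypothetical helper, the sample entry has its features trimmed for brevity, and the default host is simply one of the hosts seen in the captured list.

```javascript
// Example entry in the shape the script emits (features trimmed for brevity).
const _SearchTimeline = {
  queryId: 'G8jKRx5LiyrRDs5FcsUjsw',
  operationName: 'SearchTimeline',
  operationType: 'query',
  metadata: { featureSwitches: ['longform_notetweets_inline_media_enabled'] },
  features: { longform_notetweets_inline_media_enabled: true }
};

// Rebuilds /graphql/<queryId>/<operationName>?variables=...&features=...
const buildGraphqlUrl = (operation, variables, base = 'https://api.twitter.com') => {
  const url = new URL(`/graphql/${operation.queryId}/${operation.operationName}`, base);
  url.searchParams.set('variables', JSON.stringify(variables));
  // Operations captured without a features parameter get none here either.
  if (Object.keys(operation.features).length > 0) {
    url.searchParams.set('features', JSON.stringify(operation.features));
  }
  return url.toString();
};

console.log(buildGraphqlUrl(_SearchTimeline, { query: 'aaaa', timeline_type: 'Top' }));
```

The URL built this way is also the one that has to be passed to the signing function, since the query parameters are part of the signature.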

Getting the response

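The structure of the client API responses is basically the same as that of the web v2_timeline endpoints. For details, see the recent Twitter Monitor commits (honestly, I cannot remember them all myself).

Common failures include a wrong signature, an expired account, and a temporarily blocked account. For other errors, see Response codes and errors.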

// Temporarily blocked
{ "errors": [ { "message": "Rate limit exceeded", "code": 88 } ] }
// Account expired
{ "errors": [ { "message": "Invalid or expired token", "code": 89 } ] }
// Signature error
{ "errors": [ { "message": "Could not authenticate you", "code": 32 } ] }
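
A crawler usually wants to react differently to each of these. Here is a minimal sketch that maps the three codes above to a follow-up action; the action names and the `classifyResponse` helper are hypothetical, and any other code falls through to a generic bucket.

```javascript
// Hypothetical mapping from the error codes listed above to a follow-up action.
const ERROR_ACTIONS = {
  88: 'rest_account',   // rate limited: stop using the account for a while
  89: 'drop_account',   // invalid or expired token: remove it from the pool
  32: 'check_signature' // could not authenticate: the signing code is likely at fault
};

const classifyResponse = (json) => {
  const codes = (json?.errors ?? []).map((error) => error.code);
  if (codes.length === 0) return 'ok';
  for (const code of codes) {
    if (ERROR_ACTIONS[code]) return ERROR_ACTIONS[code];
  }
  return 'unknown_error';
};

console.log(classifyResponse({ errors: [{ message: 'Rate limit exceeded', code: 88 }] }));
```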

Miscellaneous

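  • If you obtain the oauth_token and oauth_token_secret of your own account, you can also sign requests to the endpoints as described in the OAuth signature section above
  • I am still working out the validity period myself. My earliest recorded account was created on Fri Jul 07 05:28:49 +0000 2023; I will update this post if anything changes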
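  • Risk control is very strict. I originally advised rotating IPs often and stockpiling accounts just in case, but since accounts only last about a month, stockpiling is pointless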
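  • The best option right now is not these temporary accounts but using the bearer token of the new TweetDeck (aka X Pro) to obtain a guest token; the results of the monitoring script show that most endpoints work normally.
  • The related Nitter issue also has implementations in js, python, golang, powershell and bash contributed by various people
  • As of now the Twitter api does not support ipv6, so prepare an ipv4 connection
  • I currently use BANKA2017/twitter-monitor ~/apps/open_account to run a private account pool. The scripts that obtain guest accounts are deployed on two vps, and together they obtain about 250 guest accounts per month
  • I am not familiar with shopping for proxy pools; see Easily creating ten thousand Twitter accounts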

References

How to Crawl Twitter (Android)

https://blog.nest.moe/posts/how-to-crawl-twitter-with-android

When reposting or quoting this article, please comply with the Creative Commons Attribution license.

